Displaying 20 results from an estimated 31 matches for "vfpv4".
2013 Dec 19
3
[LLVMdev] LLVM ARM VMLA instruction
Hi all,
Thanks for the info. A few observations from my side:
LLVM:
cortex-a8 vfpv3 : no vmla or vfma instruction emitted
cortex-a8 vfpv4 : no vmla or vfma instruction emitted (This is invalid
though as cortex-a8 does not have vfpv4)
cortex-a8 vfpv4 with ffp-contract=fast : vfma instruction emitted ( this
seems a bug to me!! If cortex-a8 doesn't come with vfpv4 then vfma
instructions generated will be invalid )
cortex-a15 vfpv...
2013 Dec 19
0
[LLVMdev] LLVM ARM VMLA instruction
Just to clarify: gcc 4.8.1 generates that fma at -O2; no FP relaxation or
other flags specified.
On Wed, Dec 18, 2013 at 6:02 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
> Thanks for the explanation, Tim!
>
> gcc 4.8.1 *does* generate an fma for your code example for an x86 target
> that supports fma. I'd bet that the HW vendors' compilers do the same, but
>
2013 Dec 19
0
[LLVMdev] LLVM ARM VMLA instruction
> cortex-a8 vfpv4 with ffp-contract=fast : vfma instruction emitted ( this
> seems a bug to me!! If cortex-a8 doesn't come with vfpv4 then vfma
> instructions generated will be invalid )
If I'm understanding correctly, you've specifically told it this
Cortex-A8 *does* come with vfpv4. Those kinds...
2013 Dec 19
2
[LLVMdev] LLVM ARM VMLA instruction
Thanks for the explanation, Tim!
gcc 4.8.1 *does* generate an fma for your code example for an x86 target
that supports fma. I'd bet that the HW vendors' compilers do the same, but
I don't have any of those installed at the moment to test that theory. So
this is a bug in those compilers? Do you know how they justify it?
I see section 6.5 "Expressions" in the C standard, and
2012 Nov 09
2
[LLVMdev] fmac generation for cortex-a9
...rmv7-eabi). Maybe someone from ARM can answer the question ?
Seb
From: JF Bastien [mailto:jfb at google.com]
Sent: Friday, November 09, 2012 5:36 PM
To: Sebastien DELDON-GNB
Cc: Anitha Boyapati; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] fmac generation for cortex-a9
AFAIK A9 doesn't have VFPv4 or AdvSIMDv2, so it doesn't have VFMA. I don't know what LLVM does, but it shouldn't emit VFMA when you target A9. VMLA isn't a fused multiply-add, it's a multiply followed by an add and has different latency as well as precision.
On Thu, Nov 8, 2012 at 4:57 AM, Sebastien DELDO...
2012 Nov 09
2
[LLVMdev] fmac generation for cortex-a9
...> Cc: JF Bastien; llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] fmac generation for cortex-a9
>
> Hi Sebastien,
>
> ARMv7-M has VFMA and LLVM's "triple" is far from perfect.
>
> Wikipedia tells me NovaThor can also be A15, or STE could have cramped a
> VFPv4 in it? ;) Or possibly, your code never branches into the VFMA.
> Many things could be happening, but usually, VFMA shouldn't be generated
> for A9.
>
> A GCC bug, maybe?
>
>
> On 9 November 2012 16:51, Sebastien DELDON-GNB
> <sebastien.deldon at st.com> wrote:
>...
2012 Nov 09
0
[LLVMdev] fmac generation for cortex-a9
Hi Sebastien,
ARMv7-M has VFMA and LLVM's "triple" is far from perfect.
Wikipedia tells me NovaThor can also be A15, or STE could have cramped
a VFPv4 in it? ;) Or possibly, your code never branches into the VFMA.
Many things could be happening, but usually, VFMA shouldn't be
generated for A9.
A GCC bug, maybe?
On 9 November 2012 16:51, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote:
> Hi Bastien,
>
>
>
> Weir...
2015 Feb 12
4
[LLVMdev] Building clang on Raspberry Pi2
...target-abi
aapcs-linux -mfloat-abi hard -target-linker-version 2.24.90.20141023
-dwarf-column-info -ffunction-sections -fdata-sections -coverage-file
But it's a Cortex-A7, not A8, how do I convince the test-release.sh or
build scripts to make Phase1 clang target Cortex A7 for Phase2? (with
vfpv4-d16)
Whilst we're at it, I'm not convinced that the Pi2 would make a great
cross-compilation machine, so is it possible to reduce the enabled
targets? "X86 Sparc PowerPC ARM AArch64 Mips XCore MSP430 CppBackend
NVPTX Hexagon SystemZ R600"
Thanks,
Ben
2012 Nov 09
0
[LLVMdev] fmac generation for cortex-a9
...iuc.edu
> > Subject: Re: [LLVMdev] fmac generation for cortex-a9
> >
> > Hi Sebastien,
> >
> > ARMv7-M has VFMA and LLVM's "triple" is far from perfect.
> >
> > Wikipedia tells me NovaThor can also be A15, or STE could have cramped a
> > VFPv4 in it? ;) Or possibly, your code never branches into the VFMA.
> > Many things could be happening, but usually, VFMA shouldn't be generated
> > for A9.
> >
> > A GCC bug, maybe?
> >
> >
> > On 9 November 2012 16:51, Sebastien DELDON-GNB
> > <se...
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
Hi,
I was going through the LLVM instruction code generation for ARM. I
came across VMLA instruction hazards (floating-point multiply and
accumulate). I was comparing assembly code emitted by LLVM and GCC, where I
saw that GCC was happily using the VMLA instruction for floating point while
LLVM never used it; instead it used a pair of VMUL and VADD instructions.
I wanted to know if there is
2013 Dec 18
0
[LLVMdev] LLVM ARM VMLA instruction
...happily using VMLA instruction for floating point while LLVM never used
> it, instead it used a pair of VMUL and VADD instruction.
It looks like Clang allows the formation by default, but you need to
be compiling for a CPU that actually supports the instruction (the key
feature is called "VFPv4"). That means one strictly newer than
cortex-a8: cortex-a7 (don't ask), cortex-a9, cortex-a12, cortex-a15 or
krait I believe. With that I get:
$ cat tmp.c
float foo(float accum, float lhs, float rhs) {
    return accum + lhs*rhs;
}
$ clang -target armv7-linux-gnueabihf -mcpu=cortex-a15 -S -o...
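Whether the compiler may form a fused multiply-add for the function above depends on the target features it was given. A minimal sketch, assuming the toolchain follows the Arm C Language Extensions and predefines `__ARM_FEATURE_FMA` when the selected CPU/FPU has fused multiply-add (as VFPv4 does). The helper name `compiled_with_arm_fma` is hypothetical and is not part of the thread.

```c
/* Returns 1 when this translation unit was compiled for a target that
   advertises hardware fused multiply-add through the ACLE macro
   __ARM_FEATURE_FMA, and 0 otherwise (including all non-ARM targets). */
int compiled_with_arm_fma(void) {
#if defined(__ARM_FEATURE_FMA)
    return 1;
#else
    return 0;
#endif
}

/* The function from the post, unchanged. Whether this becomes a single
   fused instruction or a separate multiply and add is up to the compiler
   and the target flags; the source does not decide it. */
float foo(float accum, float lhs, float rhs) {
    return accum + lhs * rhs;
}
```

Checking the macro at build time is one way to confirm that the `-mcpu`/`-mfpu` combination really selects a VFPv4-capable target before inspecting the generated assembly.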
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
On 18 December 2013 09:42, Tim Northover <t.p.northover at gmail.com> wrote:
> That means one strictly newer than
> cortex-a8: cortex-a7 (don't ask), cortex-a9, cortex-a12, cortex-a15 or
> krait I believe.
>
Hi Tim,
Cortex A8 and A9 use VFPv3. A7, A12 and A15 use VFPv4.
cheers,
--renato
2018 Sep 25
2
[RFC] New Clang target selection options for ARM/AArch64
Hi Eli, Renato,
Thanks for your feedback, there's a lot more to some of these things than I knew. I've addressed your points below.
The overall summary is:
- Start with converting the TargetParser to tableGen, with no user facing changes
- Add warnings based on that, behind -Wall. Starting with command lines, since directives have
larger implications that need investigation
Thanks,
2012 Nov 09
0
[LLVMdev] fmac generation for cortex-a9
AFAIK A9 doesn't have VFPv4 or AdvSIMDv2, so it doesn't have VFMA. I don't
know what LLVM does, but it shouldn't emit VFMA when you target A9. VMLA
isn't a fused multiply-add, it's a multiply followed by an add and has
different latency as well as precision.
On Thu, Nov 8, 2012 at 4:57 AM, Sebastien DELD...
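The precision difference mentioned above can be sketched in portable C. This is an illustration, not code from the thread: `mul_round_then_add` mimics a non-fused multiply-accumulate by rounding the product to float first, while `mul_exact_then_add` stands in for a fused operation by keeping the product exact in double (the product of two floats always fits in a double). All function names and input values are hypothetical, and an IEEE 754 target with round-to-nearest is assumed.

```c
#include <stdio.h>

/* Non-fused multiply-accumulate: the product is rounded to float before
   the add, so the result is rounded twice. The volatile store forces
   that intermediate rounding even if the compiler would otherwise keep
   extra precision or contract the expression. */
float mul_round_then_add(float accum, float lhs, float rhs) {
    volatile float product = lhs * rhs;
    return accum + product;
}

/* Stand-in for a fused multiply-add: the product of two floats is exact
   in double, so the product itself is never rounded to float. */
float mul_exact_then_add(float accum, float lhs, float rhs) {
    double product = (double)lhs * (double)rhs;
    return (float)((double)accum + product);
}

/* Hypothetical demo: inputs chosen so the lowest bit of the product is
   lost by the intermediate rounding. Returns 1 if the results differ. */
int rounding_demo(void) {
    float lhs = 0x1.001p0f;     /* 1 + 2^-12 */
    float accum = -0x1.002p0f;  /* -(1 + 2^-11) */
    float unfused = mul_round_then_add(accum, lhs, lhs);
    float fused_like = mul_exact_then_add(accum, lhs, lhs);
    printf("unfused=%g fused-like=%g\n", unfused, fused_like);
    return unfused != fused_like;
}
```

With these inputs the exact product is 1 + 2^-11 + 2^-24. Rounding it to float drops the last term, so the non-fused path cancels to 0 while the fused-like path keeps 2^-24. In real code the C99 function `fmaf` from `<math.h>` is the standard way to request a fused multiply-add.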
2012 Nov 12
1
[LLVMdev] RE : fmac generation for cortex-a9
...s.uiuc.edu<mailto:llvmdev at cs.uiuc.edu>
> Subject: Re: [LLVMdev] fmac generation for cortex-a9
>
> Hi Sebastien,
>
> ARMv7-M has VFMA and LLVM's "triple" is far from perfect.
>
> Wikipedia tells me NovaThor can also be A15, or STE could have cramped a
> VFPv4 in it? ;) Or possibly, your code never branches into the VFMA.
> Many things could be happening, but usually, VFMA shouldn't be generated
> for A9.
>
> A GCC bug, maybe?
>
>
> On 9 November 2012 16:51, Sebastien DELDON-GNB
> <sebastien.deldon at st.com<mailto:sebas...
2012 Nov 08
2
[LLVMdev] fmac generation for cortex-a9
Hi Anitha,
Thanks for your answer, but -mcpu=cortex-a9 -mattr=+vfp4 doesn't enable fused mac generation for me.
I would just like to understand why -mtriple=armv7-eabi enables it while -mcpu=cortex-a9 seems to disable it?
Seb
> -----Original Message-----
> From: Anitha Boyapati [mailto:anitha.boyapati at gmail.com]
> Sent: Thursday, November 08, 2012 10:22 AM
> To: Sebastien
2014 Sep 20
2
[LLVMdev] ARM assembler bug on LLVM 3.5
Hi
I have the following ARM Linux program. The program detects if the
processor has division instruction, if it does, it uses it, otherwise it
uses slower library call.
The program works with gcc, but it doesn't work with clang. clang reports
error on the sdiv instruction in the assembler.
The problem is this - you either compile this program with
-mcpu=cortex-a9, then clang reports
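The post is truncated before it shows how the program detects the division instruction, so the following is only a sketch of one common approach on ARM Linux, not the poster's code. It assumes glibc's `getauxval` and the kernel's `HWCAP_IDIVA` bit from `<asm/hwcap.h>`. The function name `cpu_has_arm_sdiv` is hypothetical, and on any other platform the function simply reports 0.

```c
#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_IDIVA */
#endif

/* Returns 1 if the kernel reports hardware integer divide in ARM state,
   0 if it does not or if this is not a 32-bit ARM Linux build. */
int cpu_has_arm_sdiv(void) {
#if defined(__linux__) && defined(__arm__) && defined(HWCAP_IDIVA)
    return (getauxval(AT_HWCAP) & HWCAP_IDIVA) != 0;
#else
    return 0;
#endif
}
```

A caller would branch on this result at run time, taking the `sdiv` path only when it returns 1 and falling back to the library division routine otherwise.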
2013 Oct 15
2
[LLVMdev] Unwanted push/pop on Cortex-M.
...lang -cc1 version 3.3 based upon LLVM 3.3 default target armv7m-none-eabi
> #include "..." search starts here:
> End of search list.
>
> $ cat a.s
> .syntax unified
> .eabi_attribute 6, 10
> .eabi_attribute 9, 2
> .eabi_attribute 10, 5
> .fpu vfpv4
> .eabi_attribute 20, 1
> .eabi_attribute 21, 1
> .eabi_attribute 23, 3
> .eabi_attribute 24, 1
> .eabi_attribute 25, 1
> .eabi_attribute 44, 1
> .file "a.c"
> .section .text.out_char,"ax",%progbits
> .globl...
2014 Sep 22
3
[LLVMdev] ARM assembler bug on LLVM 3.5
...to write portable ARM applications
that run on different cores.
> The problem is that there isn't currently a way to pass flags to the
> integrated assembler as of now (I'm working on it).
gas unlocks all instructions if these directives are used ".cpu
cortex-a15\n .fpu neon-vfpv4".
LLVM integrated assembler will choke on it and produce invalid object
file. You can try:
int main(void)
{
    asm volatile (".cpu cortex-a15\n .fpu neon-vfpv4\n");
    return 0;
}
/usr/bin/ld: Warning: /tmp/as-b084f5.o: Unknown EABI object attribute 488
/usr/bin/ld: /tmp/...
2017 Jun 06
2
celt_inner_prod() and dual_inner_prod() NEON intrinsics
Hi Linfeng,
On 06/06/17 04:09 PM, Jonathan Lennox wrote:
> Two comments on the various infrastructure for RTCD etc.
>
> 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions,
> but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.s
> correspondingly. I suspect the ‘arch’ parameter can just be ignored
> by the assembly functions, but at least the