Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] ARM assembler bug on LLVM 3.5"
2014 Sep 22
3
[LLVMdev] ARM assembler bug on LLVM 3.5
On Sun, 21 Sep 2014, Renato Golin wrote:
> On 20 September 2014 15:19, Mikulas Patocka
> <mikulas at artax.karlin.mff.cuni.cz> wrote:
> > The problem is this - you either compile this program with
> > -mcpu=cortex-a9, then clang reports an error on the sdiv instruction because
> > cortex-a9 doesn't have sdiv. Or - you compile the program with
> >
2013 Dec 19
3
[LLVMdev] LLVM ARM VMLA instruction
Hi all,
Thanks for the info. A few observations from my side:
LLVM:
cortex-a8 vfpv3: no vmla or vfma instruction emitted
cortex-a8 vfpv4: no vmla or vfma instruction emitted (this is invalid
though, as cortex-a8 does not have vfpv4)
cortex-a8 vfpv4 with ffp-contract=fast: vfma instruction emitted (this
seems a bug to me!! If cortex-a8 doesn't come with vfpv4 then vfma
instructions
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
Hi,
I was going through the LLVM instruction code generation for ARM. I
came across VMLA instruction hazards (floating-point multiply and
accumulate). I was comparing assembly code emitted by LLVM and GCC, where I
saw that GCC was happily using the VMLA instruction for floating point while
LLVM never used it; instead it used a pair of VMUL and VADD instructions.
I wanted to know if there is
2013 Dec 19
0
[LLVMdev] LLVM ARM VMLA instruction
Just to clarify: gcc 4.8.1 generates that fma at -O2; no FP relaxation or
other flags specified.
On Wed, Dec 18, 2013 at 6:02 PM, Kay Tiong Khoo <kkhoo at perfwizard.com> wrote:
> Thanks for the explanation, Tim!
>
> gcc 4.8.1 *does* generate an fma for your code example for an x86 target
> that supports fma. I'd bet that the HW vendors' compilers do the same, but
>
2013 Dec 19
2
[LLVMdev] LLVM ARM VMLA instruction
Thanks for the explanation, Tim!
gcc 4.8.1 *does* generate an fma for your code example for an x86 target
that supports fma. I'd bet that the HW vendors' compilers do the same, but
I don't have any of those installed at the moment to test that theory. So
this is a bug in those compilers? Do you know how they justify it?
I see section 6.5 "Expressions" in the C standard, and
2012 Nov 09
2
[LLVMdev] fmac generation for cortex-a9
Hi Renato,
It's definitely not A15. Could it be that the NEON units for Cortex-A9 support it but it isn't documented/recommended?
And as mentioned before, the code is working!
Seb
> -----Original Message-----
> From: rengolin at gmail.com [mailto:rengolin at gmail.com] On Behalf Of
> Renato Golin
> Sent: Friday, November 09, 2012 6:27 PM
> To: Sebastien DELDON-GNB
>
2012 Nov 09
2
[LLVMdev] fmac generation for cortex-a9
Hi Bastien,
Weird: gcc is generating fma for my platform (ST-Ericsson NovaThor with Linaro), and the code works. It also works when I use LLVM to generate fma (using llc -mtriple=armv7-eabi). Maybe someone from ARM can answer the question?
Seb
From: JF Bastien [mailto:jfb at google.com]
Sent: Friday, November 09, 2012 5:36 PM
To: Sebastien DELDON-GNB
Cc: Anitha Boyapati; llvmdev at cs.uiuc.edu
Subject:
2013 Nov 07
2
[LLVMdev] [PATCH] Do not generate nopl instruction on CPUs that don't support it.
On Tue, 5 Nov 2013, Rafael Espíndola wrote:
> Please include a testcase with the patch.
I'm sending testcase here. Compile it with
"clang -O2 -march=k6-2 -c loop.c"
> gas uses " nopl 0x0(%eax)" for k6_2. Are you sure it is a gas bug?
Yes, it is a gas bug. I should report it to the binutils maintainers.
Mikulas
> On 3 November 2013 13:50, Mikulas Patocka
>
2013 Dec 18
0
[LLVMdev] LLVM ARM VMLA instruction
> I was going through the LLVM instruction code generation for ARM. I came
> across VMLA instruction hazards (floating-point multiply and accumulate). I
> was comparing assembly code emitted by LLVM and GCC, where I saw that GCC
> was happily using the VMLA instruction for floating point while LLVM never used
> it; instead it used a pair of VMUL and VADD instructions.
It looks like
2012 Nov 09
0
[LLVMdev] fmac generation for cortex-a9
Hi Sebastien,
ARMv7-M has VFMA and LLVM's "triple" is far from perfect.
Wikipedia tells me NovaThor can also be A15, or STE could have crammed
a VFPv4 in it? ;) Or possibly, your code never branches into the VFMA.
Many things could be happening, but usually, VFMA shouldn't be
generated for A9.
A GCC bug, maybe?
On 9 November 2012 16:51, Sebastien DELDON-GNB
2013 Dec 19
0
[LLVMdev] LLVM ARM VMLA instruction
> cortex-a8 vfpv4 with ffp-contract=fast: vfma instruction emitted (this
> seems a bug to me!! If cortex-a8 doesn't come with vfpv4 then vfma
> instructions generated will be invalid)
If I'm understanding correctly, you've specifically told it this
Cortex-A8 *does* come with vfpv4. Those kinds of odd combinations can
be useful sometimes (if only for tests), so I'm not
2012 Nov 09
0
[LLVMdev] fmac generation for cortex-a9
cat /proc/cpuinfo ?
Are you sure it's generating VFMA and not VMLA?
On Fri, Nov 9, 2012 at 9:35 AM, Sebastien DELDON-GNB <
sebastien.deldon at st.com> wrote:
> Hi Renato,
>
> It's definitely not A15. Could it be that the NEON units for
> Cortex-A9 support it but it isn't documented/recommended?
> And as mentioned before, the code is working!
>
> Seb
>
2013 Nov 03
2
[LLVMdev] [PATCH] Do not generate nopl instruction on CPUs that don't support it.
Hi
This patch fixes a code generation bug: 586-class CPUs don't support the
nopl instruction, and some 686-class CPUs don't support it either.
I created bug 17792 for that.
BTW, I think you should also optimize padding on these CPUs: instead of a
stream of 0x90 nops, you should generate variants of the "lea (%esi), %esi"
instruction, as gcc does.
This patch disables generation of
2013 Dec 18
2
[LLVMdev] LLVM ARM VMLA instruction
On 18 December 2013 09:42, Tim Northover <t.p.northover at gmail.com> wrote:
> That means one strictly newer than
> cortex-a8: cortex-a7 (don't ask), cortex-a9, cortex-a12, cortex-a15 or
> krait I believe.
>
Hi Tim,
Cortex A8 and A9 use VFPv3. A7, A12 and A15 use VFPv4.
cheers,
--renato
2013 Nov 12
0
[LLVMdev] [PATCH] Do not generate nopl instruction on CPUs that don't support it.
On 7 November 2013 18:31, Mikulas Patocka
<mikulas at artax.karlin.mff.cuni.cz> wrote:
>
>
> On Tue, 5 Nov 2013, Rafael Espíndola wrote:
>
>> Please include a testcase with the patch.
>
> I'm sending testcase here. Compile it with
> "clang -O2 -march=k6-2 -c loop.c"
The test should be in the patch itself. It can use llvm-mc to check
how the nops are
2013 Nov 05
0
[LLVMdev] [PATCH] Do not generate nopl instruction on CPUs that don't support it.
Please include a testcase with the patch.
gas uses " nopl 0x0(%eax)" for k6_2. Are you sure it is a gas bug?
On 3 November 2013 13:50, Mikulas Patocka
<mikulas at artax.karlin.mff.cuni.cz> wrote:
> Hi
>
> This patch fixes a code generation bug: 586-class CPUs don't support the
> nopl instruction, and some 686-class CPUs don't support it either.
>
> I
2013 Dec 19
3
[LLVMdev] LLVM ARM VMLA instruction
Test case name:
>> llvm/projects/test-suite/SingleSource/Benchmarks/Misc/matmul_f64_4x4.c -
>> This is a 4x4 matrix multiplication; we can make small changes to make it a
>> 3x3 matrix multiplication to make things simpler to understand.
>>
>
> This is one very specific case. How does that behave on all other cases?
> Normally, every big improvement comes with
2013 Dec 19
0
[LLVMdev] LLVM ARM VMLA instruction
On 19 December 2013 11:16, suyog sarda <sardask01 at gmail.com> wrote:
> Test case name:
> llvm/projects/test-suite/SingleSource/Benchmarks/Misc/matmul_f64_4x4.c -
> This is a 4x4 matrix multiplication; we can make small changes to make it a
> 3x3 matrix multiplication to make things simpler to understand.
>
This is one very specific case. How does that behave on all
2013 Dec 19
2
[LLVMdev] LLVM ARM VMLA instruction
On Thu, Dec 19, 2013 at 4:36 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 19 December 2013 08:50, suyog sarda <sardask01 at gmail.com> wrote:
>
>> It may seem that the total number of cycles is more or less the same for a single
>> vmla and vmul+vadd. However, when the vmul+vadd combination is used instead of
>> vmla, then intermediate results will be generated
2009 Dec 01
0
[LLVMdev] thumb2 has divide instructions
I'm working with a Cortex-M3 core which is v7-M profile, and it has udiv and sdiv.
bagel
Jim Grosbach wrote:
> Hello,
>
> As I understand it, the divide instructions are only available on the
> v7-R profile of the v7 architecture. Is that incorrect?
>
> -Jim