Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] Has something changed in LLVM BUG tracking system ?"
2012 Nov 12
1
[LLVMdev] RE : fmac generation for cortex-a9
Hi Renato,
You're right it's VMLA/VMLS that are generated. Still don't understand what drives generation for Cortex-A9.
I was using fmac for floating point MAC not for fused MAC. Than I realized that we spoke about fma instead of fmac.
So back to the original problem why when using -mcpu=cortex-a9 VMLA/VMLS are not generated and when I use -mtriple=armv7-eabi they are ?
Best
2012 Nov 09
0
[LLVMdev] fmac generation for cortex-a9
cat /proc/cpuinfo ?
Are you sure it's generating VFMA and not VMLA?
On Fri, Nov 9, 2012 at 9:35 AM, Sebastien DELDON-GNB <
sebastien.deldon at st.com> wrote:
> Hi Renato,
>
> It's definitively not A15. Can this be the case that NEON units for
> cortex-A9 support it but isn't documented/recommended ?
> And as mentioned before code is working !
>
> Seb
>
2012 Nov 09
2
[LLVMdev] fmac generation for cortex-a9
Hi Renato,
It's definitively not A15. Can this be the case that NEON units for cortex-A9 support it but isn't documented/recommended ?
And as mentioned before code is working !
Seb
> -----Original Message-----
> From: rengolin at gmail.com [mailto:rengolin at gmail.com] On Behalf Of
> Renato Golin
> Sent: Friday, November 09, 2012 6:27 PM
> To: Sebastien DELDON-GNB
>
2012 Jun 04
1
[LLVMdev] llc support for ARM predication ?
Hi James,
Thanks for the answer, for Cortex-A9 would you recommend to generate thumb2 code or ARM code ? What would be the best performance wise ?
Best Regards
Seb
> -----Original Message-----
> From: James Molloy [mailto:james.molloy at arm.com]
> Sent: Thursday, May 31, 2012 9:57 AM
> To: Sebastien DELDON-GNB
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] llc support
2012 Nov 09
2
[LLVMdev] fmac generation for cortex-a9
Hi Bastien,
Weird gcc is generating fma for my platform STEricsson Novathor with Linaro, code works. It also works when I use LLVM to generate fma (using llc -mtriple=armv7-eabi). Maybe someone from ARM can answer the question ?
Seb
From: JF Bastien [mailto:jfb at google.com]
Sent: Friday, November 09, 2012 5:36 PM
To: Sebastien DELDON-GNB
Cc: Anitha Boyapati; llvmdev at cs.uiuc.edu
Subject:
2012 Nov 09
0
[LLVMdev] fmac generation for cortex-a9
Hi Sebastien,
ARMv7-M has VFMA and LLVM's "triple" is far from perfect.
Wikipedia tells me NovaThor can also be A15, or STE could have cramped
a VFPv4 in it? ;) Or possibly, your code never branches into the VFMA.
Many things could be happening, but usually, VFMA shouldn't be
generated for A9.
A GCC bug, maybe?
On 9 November 2012 16:51, Sebastien DELDON-GNB
2012 May 31
0
[LLVMdev] llc support for ARM predication ?
Hi Seb,
The ARM instruction set is a fixed-width 32-bit instruction set that has
been around since the early days of ARM.
Modern (armv4t onwards) cores mostly have another instruction set that
can be used in tandem, the "thumb" instruction set. This is a variable
width (16 or 32 bit) instruction set that provides a subset of the ARM
instruction set and was intended to provide the
2012 May 30
2
[LLVMdev] llc support for ARM predication ?
Hi James,
Thanks for the answer, can you elaborate on difference between thumb, thumb2, ARM, thumbv7.
I'm a bit lost right now. When specifying thumbv7 llc will generate thumb only code, not thumb2 ?
Best Regards
Seb
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of James Molloy
> Sent: Tuesday, May 29,
2012 Nov 08
2
[LLVMdev] fmac generation for cortex-a9
Hi Anitha,
Thanks for your answer but -mcpu=cortex-a9 -mattr=+vfp4 doesn' t enable fused mac generation for me.
I would like just to understand why -mtriple=armv7-eabi enables it while -mcpu=cortex-a9 seems to disable it ?
Seb
> -----Original Message-----
> From: Anitha Boyapati [mailto:anitha.boyapati at gmail.com]
> Sent: Thursday, November 08, 2012 10:22 AM
> To: Sebastien
2012 Nov 09
0
[LLVMdev] fmac generation for cortex-a9
AFAIK A9 doesn't have VFPv4 or AdvSIMDv2, so it doesn't have VFMA. I don't
know what LLVM does, but it shouldn't emit VFMA when you target A9. VMLA
isn't a fused multiply-add, it's a multiply followed by an add and has
different latency as well as precision.
On Thu, Nov 8, 2012 at 4:57 AM, Sebastien DELDON-GNB <
sebastien.deldon at st.com> wrote:
> Hi Anitha,
2011 Feb 28
2
[LLVMdev] Use of movupd instead of movapd for x86
Understood for the aligned case, I want to measure performance degradation for unaligned case.
I mean unaligned case versus aligned. I know this is stupid, but I want to try to pass a <4 x float>* as parameter of a routine and at the call site I want to pass a misaligned pointer. Since LLVM is generating movapd instruction it will raise an exception (SEGFAULT), I just want to know if there
2011 Mar 01
0
[LLVMdev] Use of movupd instead of movapd for x86
On Feb 28, 2011, at 2:58 AM, Sebastien DELDON-GNB wrote:
> Understood for the aligned case, I want to measure performance degradation for unaligned case.
> I mean unaligned case versus aligned. I know this is stupid, but I want to try to pass a <4 x float>* as parameter of a routine and at the call site I want to pass a misaligned pointer. Since LLVM is generating movapd instruction
2013 Feb 15
1
[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?
Hi Renato,
No I've used LNT before and it might not be as simple as you think to get it working here. I'll see what I can do, but It's unlikely I'll have much time to spend on this topic in the coming weeks.
I'm more interested coming back to my original question, and would like to know how to proceed if I want to define my own LLVM intrinsic to generate VMLA instruction. My
2011 Feb 25
3
[LLVMdev] Use of movupd instead of movapd for x86
Hi all,
Is there a way to force llc to generate movupd instruction instead of movapd for x86 target ?
I know that movapd is more performant, but I would like to measure degradation when alignment constraints are not met.
Best Regards
Seb
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
2013 Feb 15
2
[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?
Hi Lang & Renato,
I eventually set up a panda board with latest linaro delivery (eabi-hf). I did some experiments using my own compiler and LLVM 3.2 as back-end.
I use same flagset for my compiler (front-end) and just invoke llc with and without vmlx-forwarding attribute. So base arguments to llc are:
llc -march=arm -mcpu=cortex-a9 -mattr=+neon -float-abi=hard
to which I added
2012 Jul 05
3
[LLVMdev] Vector argument passing abi for ARM ?
Hi Rotem,
Thanks for the quick answer, how do I know which type is legal/illegal with respect to calling convention ?
Best Regards
Seb
> -----Original Message-----
> From: Rotem, Nadav [mailto:nadav.rotem at intel.com]
> Sent: Thursday, July 05, 2012 11:21 AM
> To: Sebastien DELDON-GNB; llvmdev at cs.uiuc.edu
> Subject: RE: Vector argument passing abi for ARM ?
>
> The
2012 Jul 27
2
[LLVMdev] Question about arm thumb2 code generation
Hi all,
Does llc -march=thumb -mcpu=cortex-a9 enable generation of thumb2 code for armv7 ?
Best Regards
Seb
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120727/da758ea0/attachment.html>
2013 Feb 15
0
[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?
On 15 February 2013 16:00, Sebastien DELDON-GNB <sebastien.deldon at st.com>wrote:
> to which I added –mattr=-vmlx-forwarding to disable vmlx forwarding for
> cortex-a9.
>
> When I DISABLE vmlx forwarding I’m observing a 7% speed-up on ref dataset
> for MILC. So I’m observing something similar to what I’ve observed on STE
> platform available on SNOWBALL board.
>
Hi
2012 Sep 21
0
[LLVMdev] Question about LLVM NEON intrinsics
On Sep 21, 2012, at 2:58 AM, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote:
> Hi Eli,
>
> Thanks for the answer, it clarifies the situation for me. Do you know if there is Pass in LLVM that could be adapted to 'legalize' intrinsics calls ?
> Or shall I define my own intrinsics for non supported types ?
You should never generate these sorts of intrinsics with
2013 Feb 19
2
[LLVMdev] Is it a bug or am I missing something ?
Hi all,
on following code:
; ModuleID = 'shufxbug.ll'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32"
target triple = "i386-pc-linux-gnu"
define void @sample_test(<4 x float>* nocapture %source, <8 x float>* nocapture %dest) nounwind noinline {
L.entry: