thr3ads.net - search: "vmlx"

Displaying 20 results from an estimated 25 matches for "vmlx".

[LLVMdev] vmlx forwarding an cortex A9 question

2012 Dec 20

Hi all, On the following code, when I use llc targeting ARM Cortex-A9 as follows, 'vmla' instructions are generated if vmlx-forwarding is turned off. It seems that -mcpu=cortex-a9 enables it by default, and thus fewer 'vmla' instructions are generated. On this specific example it doesn't make any difference in terms of performance, but on a more complex example dis...

[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 15

Hi Lang & Renato, I eventually set up a panda board with the latest Linaro delivery (eabi-hf). I did some experiments using my own compiler and LLVM 3.2 as the back-end. I use the same flag set for my compiler (front-end) and just invoke llc with and without the vmlx-forwarding attribute. So the base arguments to llc are: llc -march=arm -mcpu=cortex-a9 -mattr=+neon -float-abi=hard to which I added -mattr=-vmlx-forwarding to disable vmlx forwarding for cortex-a9. When I DISABLE vmlx forwarding I'm observing a 7% speed-up on the ref dataset for MILC. So I'm ob...

[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 12

On 12 February 2013 16:56, Sebastien DELDON-GNB <sebastien.deldon at st.com>wrote: > If this helps taking your decision, there are at least two benchmarks for > which disabling vmlx-forwarding makes a significant difference. > I think Evan's worry was to base this decision on visible and comprehensible benchmarks, such as the test-suite. If I get lucky I may be able to run on a panda board by next week and have > more info to share > That'd be great, thank...

[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 13

Hi Sebastien, How many extra vmlas did you see in 433.milc due to disabling -vmlx-forwarding? As I mentioned earlier, I saw only two additional integer vmlx instructions when I tested. Could you send me your 433.milc compile setup? (os, flags, compiler version, etc.). I'd like to try to reproduce your results. Cheers, Lang. On Tue, Feb 12, 2013 at 9:05 AM, Renato Golin &...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 08

Hi Renato, Thanks for the answer, it confirms what I was suspecting. My problem is that this behavior is controlled by vmlx forwarding on cortex-a9, and despite asking on this list, I couldn't get a clear understanding of what this option is meant for. So here are my new questions: Why is vmlx-forwarding enabled by default for cortex-a9? Is it to guarantee correctness or for performance purposes? I've made so...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 11

Hi Bob, Seb, Renato, My VMLA performance work was on Swift, rather than Cortex-A9. Sebastien - is vmlx-forwarding really the only variable you changed between your tests? As far as I can see the VMLx forwarding attribute only exists to restrict the application of one DAG combine optimization: PerformVMULCombine in ARMISelLowering.cpp, which turns (A + B) * C into (A * C) + (B * C). This combine onl...
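
The rewrite named in this snippet, (A + B) * C into (A * C) + (B * C), can be pictured with a small scalar C sketch. The function names are hypothetical and are not taken from LLVM; the sketch only shows the two expression shapes, not when the combine fires (the snippet is cut off before it says). Note that for floating point the two forms can round differently in general.

```c
/* Illustrative sketch only: function names are hypothetical, not LLVM's. */

/* Shape before the rewrite: one add feeding one multiply. */
float mul_of_sum(float a, float b, float c) {
    return (a + b) * c;
}

/* Shape after the rewrite: a multiply (a * c) followed by a
   multiply-accumulate step (acc + b * c). */
float sum_of_products(float a, float b, float c) {
    float acc = a * c;
    return acc + b * c;
}
```

The second form has the multiply-then-accumulate shape that a VMLA-style instruction can cover; whether a given compiler and target actually select one is not something this sketch demonstrates.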

[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 15

On 15 February 2013 16:00, Sebastien DELDON-GNB <sebastien.deldon at st.com>wrote: > to which I added –mattr=-vmlx-forwarding to disable vmlx forwarding for > cortex-a9. > > When I DISABLE vmlx forwarding I’m observing a 7% speed-up on ref dataset > for MILC. So I’m observing something similar to what I’ve observed on STE > platform available on SNOWBALL board. > Hi Seb, Thanks for doing thi...

[LLVMdev] vmlx forwarding option for Cortex-A9

2012 Dec 05

Hi all, Can someone explain to me why the vmlx forwarding option is enabled for cortex-a9, and what its purpose is? Best Regards Seb

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 11

...ang. On Feb 11, 2013, at 8:12 AM, Renato Golin <renato.golin at linaro.org> wrote: > On 11 February 2013 15:51, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Indeed problem is with generation of vmla.f64. Affected benchmark is MILC from SPEC 2006 suite and disabling vmlx forwarding gives a 10% speed-up on complete benchmark execution ! So it is worth a try. > > > Hi Sebastien, > > Indeed, worth having a look. Including Bob Wilson (who introduced the code in the first place, and is a connoisseur of NEON in LLVM) to see if he has a better idea of the...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 08

On 8 February 2013 12:28, Sebastien DELDON-GNB <sebastien.deldon at st.com>wrote: > Why for cortex-a9 vmlx-forwarding is enabled by default ? Is it to > guarantee correctness or for performance purpose ? I’ve made some > experiments and DISABLING vmlx-forwarding for cortex-a9 leads to generation > of more vmla/vmls .f32 and significantly improve some benchmarks. I’ve not > enter into a case...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 12

Hi all, Sorry for my naïve question, but what is Swift? Yes, vmlx-forwarding is the only variable I changed in my tests. I did the experiment on another popular FP benchmark and observed a 14% speed-up only by disabling vmlx-forwarding. Best Regards Seb My VMLA performance work was on Swift, rather than Cortex-A9. Sebastien - is vmlx-forwarding really the onl...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 12

...Evan Sent from my iPad On Feb 12, 2013, at 3:08 AM, Renato Golin <renato.golin at linaro.org> wrote: > On 12 February 2013 10:25, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: >> Same architecture, different micro-arch (implementation). Could it be the case that vmlx-forwarding makes sense for SWIFT and not for the ARM Cortex-A9 implementation ? It is enabled by default when –mcpu=cortex-a9 is used but tests I have made show significant improvements when it is disabled for cortex-A9 (STEricsson Nova platform). >> > > Hi Sebastien, > > The optimization d...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 12

Understood, Same architecture, different micro-arch (implementation). Could it be the case that vmlx-forwarding makes sense for SWIFT and not for the ARM Cortex-A9 implementation ? It is enabled by default when -mcpu=cortex-a9 is used but tests I have made show significant improvements when it is disabled for cortex-A9 (STEricsson Nova platform). Best Regards Seb From: David Tweed [mailto:david.tweed at arm....

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 11

Hi Renato, Indeed, the problem is with the generation of vmla.f64. The affected benchmark is MILC from the SPEC 2006 suite, and disabling vmlx forwarding gives a 10% speed-up on complete benchmark execution ! So it is worth a try. Now going back to vmla generation through LLVM intrinsic usage. I've looked at the .td file and it seems to me that when there is a "pattern" to generate an instruction, no intrinsic is defined to generat...

[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 15

...ON-GNB Cc: Lang Hames; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ? On 15 February 2013 16:00, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: to which I added -mattr=-vmlx-forwarding to disable vmlx forwarding for cortex-a9. When I DISABLE vmlx forwarding I'm observing a 7% speed-up on ref dataset for MILC. So I'm observing something similar to what I've observed on STE platform available on SNOWBALL board. Hi Seb, Thanks for doing this, as we expected,...

[LLVMdev] RE : Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 12

If this helps you make your decision, there are at least two benchmarks for which disabling vmlx-forwarding makes a significant difference. If I get lucky I may be able to run on a panda board by next week and have more info to share. Best Regards Seb ________________________________________ From: Evan Cheng [evan.cheng at apple.com] Sent: Tuesday, 12 February 2013 16:47 To: Renato Gol...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 12

On 12 February 2013 10:25, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Same architecture, different micro-arch (implementation). Could it be > the case that vmlx-forwarding makes sense for SWIFT and not for the ARM > Cortex-A9 implementation ? It is enabled by default when –mcpu=cortex-a9 is > used but tests I have made show significant improvements when it is disabled for > cortex-A9 (STEricsson Nova platform). > Hi Sebastien, The optimization does make s...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 08

On 8 February 2013 10:40, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi all, > > Everything is in the title, I would like to enforce generation of vmla.f32 > instruction for scalar operations on cortex-a9, so is there a LLVM neon > intrinsic available for that ? > > Hi Sebastien, LLVM doesn't use intrinsics when there is a

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 08

Hi all, Everything is in the title, I would like to enforce generation of vmla.f32 instruction for scalar operations on cortex-a9, so is there a LLVM neon intrinsic available for that ? Thanks for your answers Best Regards Seb

[LLVMdev] LLVM ARM VMLA instruction

2013 Dec 18

Hi, I was going through the LLVM instruction code generation code for ARM. I came across VMLA instruction hazards (floating point multiply and accumulate). I was comparing assembly code emitted by LLVM and GCC, where I saw that GCC was happily using the VMLA instruction for floating point while LLVM never used it; instead it used a pair of VMUL and VADD instructions. I wanted to know if there is any
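
For readers unfamiliar with the pattern discussed in this snippet, here is a minimal C sketch of a floating-point multiply-accumulate next to its explicit two-step equivalent. The function names are hypothetical, and the sketch makes no claim about which instructions GCC or LLVM emit for either form on a given target; that depends on the compiler version, the CPU selected, and the flags used.

```c
/* Illustrative sketch only: function names are hypothetical. */

/* Source form that is a candidate for a single multiply-accumulate. */
float mac(float acc, float a, float b) {
    return acc + a * b;
}

/* Explicit two-step form, mirroring a multiply followed by an add. */
float mul_then_add(float acc, float a, float b) {
    float product = a * b;
    return acc + product;
}
```

Compiling such a file for an ARM target and inspecting the assembly output is one way to see which of the two instruction sequences a particular toolchain chooses.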
