thr3ads.net - search: "vmls"

Displaying 14 results from an estimated 14 matches for "vmls".

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 08

...nderstanding what this option is meant for. So here are my new questions: Why is vmlx-forwarding enabled by default for cortex-a9? Is it to guarantee correctness or for performance? I ran some experiments, and DISABLING vmlx-forwarding for cortex-a9 leads to the generation of more vmla/vmls .f32 and significantly improves some benchmarks. I have not run into a case where it significantly degrades performance or gives incorrect answers. Thus my goal is to use my front-end to generate llvm neon intrinsics that map to LLVM vmla/vmls f32 when I think it is appropriate and not to rely...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 08

...tien DELDON-GNB <sebastien.deldon at st.com> wrote: > Why is vmlx-forwarding enabled by default for cortex-a9? Is it to guarantee correctness or for performance? I ran some experiments, and DISABLING vmlx-forwarding for cortex-a9 leads to the generation of more vmla/vmls .f32 and significantly improves some benchmarks. I have not run into a case where it significantly degrades performance or gives incorrect answers. I believe this is what you're looking for: http://article.gmane.org/gmane.comp.compilers.llvm.cvs/90709 Performance only, but if y...

[LLVMdev] fmac generation for cortex-a9

2012 Nov 08

Hi all, I have .ll code that uses double-precision fmul/fadd or fmul/fsub. When I compile it using llc -mcpu=cortex-a9, I couldn't get vmla/vmls generated, even with -fp-contract=fast, but when I use the option -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused MACs are generated. Can someone explain why? Thanks for your answers. Seb

[LLVMdev] RE : fmac generation for cortex-a9

2012 Nov 12

Hi Renato, You're right, it's VMLA/VMLS that are generated. I still don't understand what drives generation for Cortex-A9. I was using fmac to mean floating-point MAC, not fused MAC. Then I realized that we were talking about fma instead of fmac. So back to the original problem: why, when using -mcpu=cortex-a9, are VMLA/VMLS not generated and w...
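
The distinction drawn in this snippet between a plain floating-point MAC and a fused MAC comes down to how many times the result is rounded. The following is a minimal sketch in Python, not ARM code: it assumes IEEE double arithmetic and emulates the fused form with exact rationals. The function names `mla_unfused`, `mla_fused`, `mls_unfused` and `mls_fused` are hypothetical, chosen only for illustration, and the input values are picked so that the two forms visibly differ.

```python
from fractions import Fraction


def mla_unfused(acc, a, b):
    """Multiply-accumulate with two roundings: the product is rounded, then the sum."""
    product = a * b           # first rounding to double
    return acc + product      # second rounding to double


def mla_fused(acc, a, b):
    """Emulated fused multiply-accumulate: exact arithmetic, one final rounding."""
    exact = Fraction(acc) + Fraction(a) * Fraction(b)
    return float(exact)       # the only rounding


def mls_unfused(acc, a, b):
    """Multiply-subtract (acc - a*b) with two roundings."""
    product = a * b
    return acc - product


def mls_fused(acc, a, b):
    """Emulated fused multiply-subtract with one final rounding."""
    exact = Fraction(acc) - Fraction(a) * Fraction(b)
    return float(exact)


# Illustrative inputs (an assumption of this sketch): the exact product is
# 1 - 2**-54, which lies halfway between two doubles and rounds to 1.0.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27

mla_two_roundings = mla_unfused(-1.0, a, b)
mla_one_rounding = mla_fused(-1.0, a, b)
mls_two_roundings = mls_unfused(1.0, a, b)
mls_one_rounding = mls_fused(1.0, a, b)

print(mla_two_roundings, mla_one_rounding)
print(mls_two_roundings, mls_one_rounding)
```

With these inputs the unfused forms lose the low-order bits of the product before the add or subtract, while the fused emulation keeps them. That is why a compiler may only substitute one form for the other when floating-point contraction is permitted.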

[LLVMdev] fmac generation for cortex-a9

2012 Nov 09

...> >> > I have .ll code that uses double-precision fmul/fadd or fmul/fsub. When I compile it using llc -mcpu=cortex-a9, I couldn't get vmla/vmls generated, even with -fp-contract=fast, but when I use the option -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused MACs are generated. Can someone explain why? > >> Perhaps you need to use so...

[LLVMdev] fmac generation for cortex-a9

2012 Nov 09

...> > >> > I have .ll code that uses double-precision fmul/fadd or fmul/fsub. When I compile it using llc -mcpu=cortex-a9, I couldn't get vmla/vmls generated, even with -fp-contract=fast, but when I use the option -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused MACs are generated. Can someone explain why? ...

[LLVMdev] fmac generation for cortex-a9

2012 Nov 08

...3:56, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > > Hi all, > > I have .ll code that uses double-precision fmul/fadd or fmul/fsub. When I compile it using llc -mcpu=cortex-a9, I couldn't get vmla/vmls generated, even with -fp-contract=fast, but when I use the option -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused MACs are generated. Can someone explain why? > Perhaps you need to use some attributes. -mattr=+vfp4 Check fusedMAC.ll from ARM cod...

[LLVMdev] fmac generation for cortex-a9

2012 Nov 08

On 8 November 2012 13:56, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi all, > I have .ll code that uses double-precision fmul/fadd or fmul/fsub. When I compile it using llc -mcpu=cortex-a9, I couldn't get vmla/vmls generated, even with -fp-contract=fast, but when I use the option -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused MACs are generated. Can someone explain why? Perhaps you need to use some attributes: -mattr=+vfp4. Check fusedMAC.ll from the ARM codegen tests. -- Anitha

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 08

On 8 February 2013 10:40, Sebastien DELDON-GNB <sebastien.deldon at st.com> wrote: > Hi all, > Everything is in the title: I would like to enforce generation of the vmla.f32 instruction for scalar operations on cortex-a9, so is there an LLVM neon intrinsic available for that? Hi Sebastien, LLVM doesn't use intrinsics when there is a

[LLVMdev] fmac generation for cortex-a9

2012 Nov 09

...tien.deldon at st.com> wrote: > > Hi all, > > I have .ll code that uses double-precision fmul/fadd or fmul/fsub. When I compile it using llc -mcpu=cortex-a9, I couldn't get vmla/vmls generated, even with -fp-contract=fast, but when I use the option -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused MACs are generated. Can someone explain why? > Perhaps you need to use some attributes. -mattr=+vfp4 Check fusedMAC.ll from ARM code...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 08

Hi all, Everything is in the title: I would like to enforce generation of the vmla.f32 instruction for scalar operations on cortex-a9, so is there an LLVM neon intrinsic available for that? Thanks for your answers. Best Regards, Seb

[LLVMdev] fmac generation for cortex-a9

2012 Nov 09

...stien.deldon at st.com> wrote: > > > Hi all, > > > I have .ll code that uses double-precision fmul/fadd or fmul/fsub. When I compile it using llc -mcpu=cortex-a9, I couldn't get vmla/vmls generated, even with -fp-contract=fast, but when I use the option -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused MACs are generated. Can someone explain why? > > Perhaps you need to use some attributes. -mattr=+vfp4 Check...

[LLVMdev] fmac generation for cortex-a9

2012 Nov 09

...ebastien.deldon at st.com> wrote: >> > Hi all, >> > I have .ll code that uses double-precision fmul/fadd or fmul/fsub. When I compile it using llc -mcpu=cortex-a9, I couldn't get vmla/vmls generated, even with -fp-contract=fast, but when I use the option -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused MACs are generated. Can someone explain why? >> Perhaps you need to use some attributes. -mattr=+vfp4 Check f...

[LLVMdev] Is there any llvm neon intrinsic that maps to vmla.f32 instruction ?

2013 Feb 11

...tien.deldon at st.com> wrote: Why is vmlx-forwarding enabled by default for cortex-a9? Is it to guarantee correctness or for performance? I ran some experiments, and DISABLING vmlx-forwarding for cortex-a9 leads to the generation of more vmla/vmls .f32 and significantly improves some benchmarks. I have not run into a case where it significantly degrades performance or gives incorrect answers. I believe this is what you're looking for: http://article.gmane.org/gmane.comp.compilers.llvm.cvs/90709 Performance only, but if you're s...
