Hi Renato,

It's definitely not A15. Could it be that the NEON unit on Cortex-A9 supports it, but that this isn't documented or recommended?
And as mentioned before, the code is working!

Seb

> -----Original Message-----
> From: rengolin at gmail.com [mailto:rengolin at gmail.com] On Behalf Of
> Renato Golin
> Sent: Friday, November 09, 2012 6:27 PM
> To: Sebastien DELDON-GNB
> Cc: JF Bastien; llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] fmac generation for cortex-a9
>
> Hi Sebastien,
>
> ARMv7-M has VFMA and LLVM's "triple" is far from perfect.
>
> Wikipedia tells me NovaThor can also be A15, or STE could have crammed a
> VFPv4 in it? ;) Or possibly, your code never branches into the VFMA.
> Many things could be happening, but usually, VFMA shouldn't be generated
> for A9.
>
> A GCC bug, maybe?
>
>
> On 9 November 2012 16:51, Sebastien DELDON-GNB
> <sebastien.deldon at st.com> wrote:
> > Hi Bastien,
> >
> > Weird: gcc is generating fma for my platform, STEricsson Novathor with
> > Linaro, and the code works. It also works when I use LLVM to generate fma
> > (using llc -mtriple=armv7-eabi). Maybe someone from ARM can answer
> > the question?
> >
> > Seb
> >
> > From: JF Bastien [mailto:jfb at google.com]
> > Sent: Friday, November 09, 2012 5:36 PM
> > To: Sebastien DELDON-GNB
> > Cc: Anitha Boyapati; llvmdev at cs.uiuc.edu
> > Subject: Re: [LLVMdev] fmac generation for cortex-a9
> >
> > AFAIK A9 doesn't have VFPv4 or AdvSIMDv2, so it doesn't have VFMA. I
> > don't know what LLVM does, but it shouldn't emit VFMA when you target
> > A9. VMLA isn't a fused multiply-add, it's a multiply followed by an
> > add and has different latency as well as precision.
> >
> > On Thu, Nov 8, 2012 at 4:57 AM, Sebastien DELDON-GNB
> > <sebastien.deldon at st.com> wrote:
> >
> > Hi Anitha,
> >
> > Thanks for your answer, but -mcpu=cortex-a9 -mattr=+vfp4 doesn't
> > enable fused mac generation for me.
> > I would just like to understand why -mtriple=armv7-eabi enables it
> > while -mcpu=cortex-a9 seems to disable it?
> >
> > Seb
> >
> >> -----Original Message-----
> >> From: Anitha Boyapati [mailto:anitha.boyapati at gmail.com]
> >> Sent: Thursday, November 08, 2012 10:22 AM
> >> To: Sebastien DELDON-GNB
> >> Cc: llvmdev at cs.uiuc.edu
> >> Subject: Re: [LLVMdev] fmac generation for cortex-a9
> >>
> >> On 8 November 2012 13:56, Sebastien DELDON-GNB
> >> <sebastien.deldon at st.com> wrote:
> >> > Hi all,
> >> >
> >> > I have .ll code that uses double-precision fmul/fadd or fmul/fsub.
> >> > When I compile it using llc -mcpu=cortex-a9 I can't get vmla/vmls
> >> > generated, even using -fp-contract=fast, but when I use the option
> >> > -mtriple=armv7-eabi instead of -mcpu=cortex-a9, fused mac are
> >> > generated. Can someone explain why?
> >> >
> >>
> >> Perhaps you need to use some attributes: -mattr=+vfp4. Check
> >> fusedMAC.ll from the ARM codegen tests.
> >>
> >> --
> >> Anitha
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
> --
> cheers,
> --renato
>
> http://systemcall.org/
What does cat /proc/cpuinfo say?

Are you sure it's generating VFMA and not VMLA?

On Fri, Nov 9, 2012 at 9:35 AM, Sebastien DELDON-GNB
<sebastien.deldon at st.com> wrote:

> Hi Renato,
>
> It's definitely not A15. Could it be that the NEON unit on Cortex-A9
> supports it, but that this isn't documented or recommended?
> And as mentioned before, the code is working!
>
> Seb
Hi Renato,

You're right, it's VMLA/VMLS that are generated. I still don't understand what drives their generation for Cortex-A9.
I was using "fmac" to mean floating-point MAC, not fused MAC. Then I realized that we were talking about fma rather than fmac.
So, back to the original problem: why are VMLA/VMLS not generated when I use -mcpu=cortex-a9, while they are generated when I use -mtriple=armv7-eabi?

Best Regards
Seb

________________________________________
From: JF Bastien [jfb at google.com]
Sent: Friday, November 9, 2012 18:45
To: Sebastien DELDON-GNB
Cc: Renato Golin; llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] fmac generation for cortex-a9

What does cat /proc/cpuinfo say?

Are you sure it's generating VFMA and not VMLA?
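
For readers following the fma vs. fmac confusion in this thread: the point that VMLA is a multiply followed by an add, with different precision from a fused VFMA, can be illustrated numerically. The sketch below is only a model in Python on IEEE doubles, not ARM code; the function names mul_then_add and fused_mul_add and the input values are made up for illustration, and the fused result is emulated with exact rational arithmetic rather than a hardware instruction.

```python
from fractions import Fraction


def mul_then_add(a: float, b: float, c: float) -> float:
    """Unfused MAC (VMLA-like model): the product is rounded to a double,
    then the sum is rounded again."""
    product = a * b          # first rounding
    return product + c       # second rounding


def fused_mul_add(a: float, b: float, c: float) -> float:
    """Fused MAC (VFMA-like model): a*b + c is computed exactly and
    rounded to a double only once at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # exact rational value
    return float(exact)                              # single rounding


# Hypothetical inputs chosen so that the exact product 1 - 2**-54
# is not representable as a double and rounds to 1.0.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print(mul_then_add(a, b, c))   # prints 0.0
print(fused_mul_add(a, b, c))  # prints -5.551115123125783e-17, i.e. -2**-54
```

The two results differ because the unfused version loses the low-order bits of the product before the add. This is why a compiler has to treat contracting fmul/fadd into a fused instruction as a change in numerical behaviour, whereas selecting VMLA/VMLS keeps the two roundings.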