Thanks, that helps a lot.

> All chips (to date) with NEON have VFP3, so it's safe to assume that a
> -mfpu=neon target will have VFP3, so all the decisions about code
> generated for VFP3 can safely be assumed by targets with NEON.

Just to confirm my understanding, can I correctly say in general that the llc code generator might blur distinctions between NEON and VFP3 when it can do so safely?

-David

-----Original Message-----
From: rengolin at gmail.com [mailto:rengolin at gmail.com] On Behalf Of Renato Golin
Sent: Friday, May 27, 2011 2:38 AM
To: David Dunkle
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] Question about ARM/vfp/NEON code generation

On 27 May 2011 02:04, David Dunkle <ddunkle at arxan.com> wrote:
> In all cases, I get code that looks pretty much the same; it's like
> what is below. However, I am expecting to see instruction level
> differences between the vfp3 and neon versions. When I do the same
> with gcc 4.2 I do see differences in the generated code.

Hi David,

You could see different instructions (as gcc does, you say), but it's not necessary.

Your example has only floating point arithmetic, which both VFP3 and NEON can do, so the final assembly will be similar. If you start using integer arithmetic, then you can see vector instructions for NEON (if it's vectorized) and not for VFP3.

All chips (to date) with NEON have VFP3, so it's safe to assume that a -mfpu=neon target will have VFP3, so all the decisions about code generated for VFP3 can safely be assumed by targets with NEON.

Hope that answers your questions.

cheers,
--renato
On May 27, 2011, at 10:49 AM, David Dunkle wrote:
> Thanks, that helps a lot.
>
>> All chips (to date) with NEON have VFP3, so it's safe to assume that a
>> -mfpu=neon will have VFP3, so all the decisions about code generated
>> for VFP3 can safely be assumed by targets with NEON.
>
> Just to confirm my understanding, can I correctly say in general that
> the llc code generator might blur distinctions between NEON and VFP3
> when it can do so safely?

Not exactly. The distinction is clear, it's just not expressed as an either/or question. Specifically, the code generator considers NEON to be a proper superset of VFP3. So if it has only VFP3, that's all it will use. If it has NEON, it assumes it also has VFP3 and can use either. There's not, currently, a way to say "use only NEON instructions; don't generate any VFP3."

-Jim
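The superset relationship described in the reply above can be sketched as a small model. This is only an illustration, not LLVM's actual implementation: the table `IMPLIED_FEATURES` and the functions `expand_features` and `usable_instruction_sets` are hypothetical names invented here, and the only relation taken from the thread is that NEON implies VFP3.

```python
# Hypothetical model of feature implication; not taken from LLVM's sources.
# The only relation stated in the thread is "NEON implies VFP3".
IMPLIED_FEATURES = {
    "neon": {"vfp3"},
    "vfp3": set(),
}


def expand_features(requested):
    """Return the requested features plus everything they imply."""
    enabled = set()
    worklist = list(requested)
    while worklist:
        feature = worklist.pop()
        if feature in enabled:
            continue
        enabled.add(feature)
        # Follow implications transitively.
        worklist.extend(IMPLIED_FEATURES.get(feature, ()))
    return enabled


def usable_instruction_sets(requested):
    """List the FP/SIMD instruction sets a code generator could pick from."""
    return sorted(expand_features(requested) & {"neon", "vfp3"})


if __name__ == "__main__":
    print(usable_instruction_sets(["vfp3"]))  # VFP3 only
    print(usable_instruction_sets(["neon"]))  # NEON and VFP3
```

In this model, asking for VFP3 yields only VFP3, while asking for NEON yields both. Because enabling NEON always pulls in VFP3, the model has no way to express "NEON only", which mirrors the limitation Jim describes.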
On 27 May 2011 19:47, Jim Grosbach <grosbach at apple.com> wrote:
> Not exactly. The distinction is clear, it's just not expressed as an
> either/or question. Specifically, the code generator considers NEON to
> be a proper superset of VFP3. So if it has only VFP3, that's all it will
> use. If it has NEON, it assumes it also has VFP3 and can use either.

Indeed.

> There's not, currently, a way to say "use only NEON instructions; don't
> generate any VFP3."

Which would be advantageous in some cases, where NEON instructions are faster than VFP3. But the way it's done today in LLVM is correct.

The output doesn't have to be different between NEON and VFP3 for VFP3 operations, but it can be. GCC has some of that knowledge and it's just a matter of time for LLVM to catch up. ;)

cheers,
--renato