Phil Wang
2014-Dec-24 06:29 UTC
[opus] [Opus][RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
Hi,

I am working on the DSP module of Ne10. I see that there are both fixed-point and floating-point FFTs inside Opus. Is the fixed-point FFT only a fallback for CPUs without VFP? On ARMv7-A and ARMv8-A, benchmark results show that the fixed-point (int32) and floating-point (float32) FFTs have similar performance, so I guess the fixed-point version is not often used on these platforms. Is it worth the effort to NEON-optimize the fixed-point FFT?

Best Regards,
Phil Wang

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
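For readers less familiar with what the two FFT variants in the question differ in, the following is a minimal sketch of a single radix-2 butterfly done in fixed-point arithmetic. It assumes 32-bit data, Q15 twiddle factors, and a halving at each stage to keep the results in range (a common choice, not necessarily what Opus or Ne10 do). The names cpx32, tw16, cmul_q15 and butterfly2 are hypothetical and are not taken from either code base.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types for illustration only. */
typedef struct { int32_t r, i; } cpx32;  /* fixed-point complex sample */
typedef struct { int16_t r, i; } tw16;   /* twiddle factor in Q15 (raw / 32768) */

/* Complex multiply a * w with w in Q15.
 * The products are widened to 64 bits so they cannot overflow.
 * Assumes >> on negative values is an arithmetic shift, as on
 * mainstream compilers (the C standard leaves it implementation-defined). */
static inline cpx32 cmul_q15(cpx32 a, tw16 w)
{
    cpx32 out;
    out.r = (int32_t)(((int64_t)a.r * w.r - (int64_t)a.i * w.i) >> 15);
    out.i = (int32_t)(((int64_t)a.r * w.i + (int64_t)a.i * w.r) >> 15);
    return out;
}

/* In-place radix-2 butterfly with a 1/2 scale per stage:
 *   x0' = (x0 + w * x1) / 2
 *   x1' = (x0 - w * x1) / 2 */
static inline void butterfly2(cpx32 *x0, cpx32 *x1, tw16 w)
{
    assert(x0 != 0 && x1 != 0);
    cpx32 t = cmul_q15(*x1, w);
    cpx32 a = *x0;
    x0->r = (int32_t)(((int64_t)a.r + t.r) >> 1);
    x0->i = (int32_t)(((int64_t)a.i + t.i) >> 1);
    x1->r = (int32_t)(((int64_t)a.r - t.r) >> 1);
    x1->i = (int32_t)(((int64_t)a.i - t.i) >> 1);
}
```

A floating-point butterfly has the same shape but needs neither the shifts nor the widening, which is part of why the relative speed of the two depends so much on the integer and FPU pipelines of the particular CPU.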
Viswanath Puttagunta
2014-Dec-24 13:22 UTC
[opus] [Opus][RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
Hi Phil,

**Feedback based on my relatively recent experience with Opus, so anyone with more experience with Opus, please correct/add as necessary.**

Looking at configure.ac:

AC_DEFINE([FIXED_POINT], [1], [Compile as fixed-point (for machines without a fast enough FPU)])

So FIXED_POINT is used even if the CPU has NEON and an FPU, but its FPU is slow. Looking at the fixed-point NEON assembly for celt_pitch_xcorr() (celt_pitch_xcorr_arm.s), I am assuming such a case does exist. (Specific ARM CPUs, anyone?) This leads me to believe that there is value in optimizing the FFT routine using NEON fixed-point instructions for some low-end ARMv7 CPUs.

Fixed point in Opus may not be worth it for ARMv8. I would be surprised if anyone at ARM disagrees with me on this statement. If they do, please let me know; it will definitely be news to me.

In any case, FYI, I am currently only working on integrating the Ne10 float32 FFT into Opus and am not looking at any fixed-point optimizations at the moment. I will see the Opus crowd's response to gauge further.

Regards,
Vish
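As a rough illustration of what a compile-time FIXED_POINT switch like the one quoted from configure.ac typically selects, here is a minimal sketch in C. Only the FIXED_POINT macro name comes from the message above; sample_t, the q15_* helpers and the sample_* wrappers are hypothetical names chosen for this example and are not the actual Opus typedefs or macros.

```c
#include <assert.h>
#include <stdint.h>

/* Q15 helpers: a 16-bit integer r represents the value r / 32768. */
static inline int16_t q15_from_double(double x)
{
    assert(x >= -1.0 && x < 1.0);        /* Q15 cannot represent +1.0 */
    return (int16_t)(x * 32768.0);
}

static inline double q15_to_double(int16_t r)
{
    return r / 32768.0;
}

/* Q15 * Q15 -> Q15. The product is widened to 32 bits before the shift.
 * Note: (-1.0) * (-1.0) does not fit in Q15; real code would saturate. */
static inline int16_t q15_mul(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 15);
}

/* The build-time switch: one set of names, two arithmetic back ends. */
#ifdef FIXED_POINT
typedef int16_t sample_t;
static inline sample_t sample_from_double(double x) { return q15_from_double(x); }
static inline double sample_to_double(sample_t s) { return q15_to_double(s); }
static inline sample_t sample_mul(sample_t a, sample_t b) { return q15_mul(a, b); }
#else
typedef float sample_t;
static inline sample_t sample_from_double(double x) { return (sample_t)x; }
static inline double sample_to_double(sample_t s) { return (double)s; }
static inline sample_t sample_mul(sample_t a, sample_t b) { return a * b; }
#endif
```

Code written against sample_t and sample_mul() compiles unchanged in both modes; building with -DFIXED_POINT swaps in the integer path. A NEON-optimized routine would then have to be provided separately for each of the two back ends, which is the cost being weighed in this thread.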
Peter Robinson
2014-Dec-25 10:35 UTC
[opus] [Opus][RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
> I am working on DSP module of Ne10. I see there are fixed-point and
> floating-point FFT inside Opus. Is fixed-point FFT only a fall back for CPU
> without VFP? On ARMv7-A and ARMv8-A, benchmark result shows that fixed-point
> (int32) and floating-point (float32) FFT have similar performance. I guess
> fixed-point version is not often used on these platforms. Is it worth the
> effort to NEON-optimize fixed-point FFT?

Floating point units are optional on the ARM Cortex-M series, so I believe it might still be worthwhile. The Cortex-M3 through M7 are based on the ARMv7 architecture.

[1] https://en.wikipedia.org/wiki/ARM_Cortex-M#Instruction_sets
Phil Wang
2014-Dec-25 10:40 UTC
[opus] [Opus][RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
Hi Peter,

We are focused on CPUs with NEON, and the Cortex-M series does not have it.

Thanks,
Phil Wang
Jean-Marc Valin
2014-Dec-25 16:28 UTC
[opus] [Opus][RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
There is definitely some use for a Neon fixed-point FFT. How much exactly, I'm not sure. Fixed-point is a bit more than just a fallback for CPUs with no FPU: there are CPUs for which fixed-point is still faster. It depends on the exact model but also on what you run. For example, even on x86 I believe that SILK encoding is slightly faster in fixed-point, even though CELT is faster in float.

Cheers,
Jean-Marc
Timothy B. Terriberry
2014-Dec-25 22:51 UTC
[opus] [Opus][RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
Jean-Marc Valin wrote:
> There is definitely some use for a Neon fixed-point FFT. How much
> exactly I'm not sure. Fixed-point is a bit more than just a fall-back

Well, we use fixed-point mode by default in Firefox for both Firefox OS and Fennec (Firefox on Android). The reason is that, although there is some NEON-class hardware where float does finally appear to be a little bit faster (e.g., recent A9's), there are still plenty where it is _significantly_ slower. So if you're going to pick one version to run on many devices, fixed-point has much better worst-case performance.
Phil Wang
2014-Dec-26 03:07 UTC
[opus] [Opus][RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
Hi Peter,

I have discussed this with developers working on the Cortex-M series. For Cortex-M, a well-optimized FFT written in C is good enough for most cases. Although the SIMD instructions for Cortex-M and NEON have a similar form, optimizing for one is very different from optimizing for the other. CMSIS provides some well-optimized DSP routines for Cortex-M, and we don't want to overlap with it. But since you mentioned it, I will benchmark Ne10 on Cortex-M, with NEON disabled.

Thanks,
Phil Wang