Viswanath Puttagunta
2015-Jan-30 17:00 UTC
[opus] fixed point version for celt_pitch_xcorr on aarch64
On 30 January 2015 at 02:41, Timothy B. Terriberry <tterribe at xiph.org> wrote:
> Zhongwei Yao wrote:
>> Hi, all,
>>
>> Does Opus need celt_pitch_xcorr's fixed point version for ARM aarch64
>> architecture? If yes, which version does Opus prefer: assembly or
>> intrinsics?
>
> It would be nice to have one. I don't have a lot of experience with
> aarch64 (I still haven't been able to obtain a dev board), so I don't
> really know how intrinsics compare to assembly. Historically, intrinsics
> performance on most platforms has been significantly below that of
> hand-written assembly, and the tool support is more of a headache, which
> is why we've favored hand-written assembly, but getting some kind of
> vectorization is better than the serial code we currently have.

Could you please elaborate on "It would be nice to have"? Specifically:
- Are there use cases where fixed point is preferred, given that AArch64 has mandatory support for floating point both in the regular CPU and in NEON?
- Does using fixed point on any CPU (regardless of ARMv7/ARMv8 or otherwise) have notable advantages over using floating point (performance, compatibility, or otherwise)?
- If the answer to the above question is yes, does the same logic apply to ARMv8?
Timothy B. Terriberry
2015-Feb-01 22:26 UTC
[opus] fixed point version for celt_pitch_xcorr on aarch64
Viswanath Puttagunta wrote:
> Could you please elaborate on "It would be nice to have"? Specifically:
> - Are there use cases where fixed point is preferred when AArch64 has
> mandatory support for floating point both in regular CPU as well as
> NEON?

Even on x86, when the complexity setting is below the maximum, then for medium-low bitrate speech the fixed-point encoder will generally be faster, because much of the SILK processing uses exact integer math, and this avoids several float->int->float round trips. That's why the x86 intrinsics code Cisco contributed focused on fixed-point, for example.

I have not tested on aarch64, but I expect similar properties to hold.

It also depends on the rest of your pipeline. If your whole audio pipeline is fixed-point (which is common in real-time stacks), then you'd have to pay additional conversion penalties to use the floating-point API.
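To make the "conversion penalties" concrete, here is a minimal scalar sketch of the work a fixed-point pipeline takes on at each boundary with a floating-point API. The helper names (`pcm16_to_float`, `float_to_pcm16`) and the 1/32768 scaling are assumptions for illustration only; this is not code from the Opus tree.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen 16-bit PCM to float in [-1.0, 1.0). */
static void pcm16_to_float(const int16_t *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (float)in[i] * (1.0f / 32768.0f);
}

/* Hypothetical helper: scale back, saturate, and round to nearest.
 * Every sample pays a multiply, two compares, and a float->int convert. */
static void float_to_pcm16(const float *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float v = in[i] * 32768.0f;
        if (v > 32767.0f)  v = 32767.0f;   /* saturate positive overflow */
        if (v < -32768.0f) v = -32768.0f;  /* saturate negative overflow */
        /* Round half away from zero; the cast itself truncates. */
        out[i] = (int16_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
    }
}
```

Both passes touch every sample in both directions, which is the overhead an all-integer path avoids.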
Jean-Marc Valin
2015-Feb-02 01:53 UTC
[opus] fixed point version for celt_pitch_xcorr on aarch64
Speaking of conversion, do we actually have Neon code for int<->float conversion?

Jean-Marc

On 01/02/15 05:26 PM, Timothy B. Terriberry wrote:
> Viswanath Puttagunta wrote:
>> Could you please elaborate on "It would be nice to have"? Specifically:
>> - Are there use cases where fixed point is preferred when AArch64 has
>> mandatory support for floating point both in regular CPU as well as
>> NEON?
>
> Even on x86, when the complexity setting is below the maximum, then for
> medium-low bitrate speech the fixed-point encoder will generally be
> faster, because much of the SILK processing uses exact integer math, and
> this avoids several float->int->float round trips. That's why the x86
> intrinsics code Cisco contributed focused on fixed-point, for example.
>
> I have not tested on aarch64, but I expect similar properties to hold.
>
> It also depends on the rest of your pipeline. If your whole audio
> pipeline is fixed-point (which is common in real-time stacks), then
> you'd have to pay additional conversion penalties to use the
> floating-point API.
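For reference, a vectorized int16-to-float conversion with NEON intrinsics could look roughly like the sketch below. It says nothing about whether such code exists in Opus; the function name `pcm16_to_float_vec` and the 1/32768 scaling are assumptions, and the scalar loop serves as the fallback on non-NEON builds and for the tail.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper, not taken from the Opus sources. */
static void pcm16_to_float_vec(const int16_t *in, float *out, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Four samples per iteration: load, widen to 32 bits, convert, scale. */
    for (; i + 4 <= n; i += 4) {
        int16x4_t   s16 = vld1_s16(in + i);
        int32x4_t   s32 = vmovl_s16(s16);
        float32x4_t f   = vcvtq_f32_s32(s32);
        vst1q_f32(out + i, vmulq_n_f32(f, 1.0f / 32768.0f));
    }
#endif
    /* Scalar tail, and the whole loop when NEON is unavailable. */
    for (; i < n; i++)
        out[i] = (float)in[i] * (1.0f / 32768.0f);
}
```

The reverse direction would additionally need rounding and saturation before narrowing back to 16 bits.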
Viswanath Puttagunta
2015-Feb-02 17:31 UTC
[opus] fixed point version for celt_pitch_xcorr on aarch64
On 1 February 2015 at 16:26, Timothy B. Terriberry <tterribe at xiph.org> wrote:
> Viswanath Puttagunta wrote:
>
>> Could you please elaborate on "It would be nice to have"? Specifically:
>> - Are there use cases where fixed point is preferred when AArch64 has
>> mandatory support for floating point both in regular CPU as well as
>> NEON?
>
> Even on x86, when the complexity setting is below the maximum, then for
> medium-low bitrate speech the fixed-point encoder will generally be faster,
> because much of the SILK processing uses exact integer math, and this
> avoids several float->int->float round trips. That's why the x86 intrinsics
> code Cisco contributed focused on fixed-point, for example.
>
> I have not tested on aarch64, but I expect similar properties to hold.
>
> It also depends on the rest of your pipeline. If your whole audio pipeline
> is fixed-point (which is common in real-time stacks), then you'd have to
> pay additional conversion penalties to use the floating-point API.

Thanks. I will add NEON support for the fixed-point code to my to-do list.