similar to: fixed point version for celt_pitch_xcorr on aarch64

Displaying 20 results from an estimated 1000 matches similar to: "fixed point version for celt_pitch_xcorr on aarch64"

2014 Dec 24
6
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
Hi, I am working on the DSP module of Ne10. I see there are fixed-point and floating-point FFTs inside Opus. Is the fixed-point FFT only a fallback for CPUs without VFP? On ARMv7-A and ARMv8-A, benchmark results show that fixed-point (int32) and floating-point (float32) FFTs have similar performance. I guess the fixed-point version is not often used on these platforms. Is it worth the effort to NEON-optimize
2015 Jan 19
1
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize
Hi Jean-Marc, I have implemented fixed-point FFT with 32-bit twiddles. Now I want to evaluate the accuracy, what method does Opus use? I use function implemented inside Ne10 to calculate SNR. Any comment?
| size | SNR (dB) |
| 16 | 82.558587 |
| 32 | 83.530298 |
| 60 | 80.292433 |
| 64 | 82.752950 |
| 120 | 79.625077 |
| 128 | 83.091260 |
| 240 | 79.555263 |
| 256 |
2014 Dec 29
2
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize
Hi Timothy, It requires some extra effort if twiddles and input/output have different bit width. Since Opus uses int32 for twiddles, we are going to do the same thing. Thanks, Phil Wang
2014 Dec 25
2
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
Jean-Marc Valin wrote: > There is definitely some use for a Neon fixed-point FFT. How much > exactly I'm not sure. Fixed-point is a bit more than just a fall-back Well, we use fixed-point mode by default in Firefox for both Firefox OS and Fennec (Firefox on Android). The reason is that, although there is some NEON-class hardware where float does finally appear to be a little bit
2014 Dec 26
0
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
Thanks Timothy and Jean-Marc, I will start NEON-optimizing the fixed-point FFT. Is int32 good enough? Benchmark data shows that FFT using int16 is much faster than FFT using int32. > -----Original Message----- > From: Timothy B. Terriberry [mailto:tterribe at xiph.org] > Sent: Friday, December 26, 2014 6:52 AM > To: Phil Wang; opus at xiph.org > Cc: Zhongwei Yao; Yang Zhang; Zhou
2013 Sep 10
0
[LLVMdev] Address Space Casting
Hi, | This patch introduces a new IR instruction named 'addrspacecast' that will be | used to represent the casting operation between pointers of different address | spaces. This instruction will represent whatever kind of conversion (potentially | both value and size of the pointer) and the semantic of the conversion between a | pair of address spaces is target specific. Assuming I
2015 May 11
1
opus Digest, Vol 76, Issue 11
Hi Jean-Marc, Thanks for pointing us the way. Yes, it is an overflow problem. I moved all the scaling code ahead of any other operations, and test_unit_mdct passes for all sizes. I will update Ne10 right after Vish double-checks it on hardware. He will repost patches with more verification later this week. Regards, Phil Wang Well, I see three questions that need to be answered at this point
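The message above describes scaling before any other operation. A minimal sketch of why that ordering matters for 32-bit fixed-point data is given here. It is an illustration under assumptions, not the actual Ne10 change: the function name `butterfly_prescaled` is hypothetical, and it assumes the compiler implements `>>` on negative values as an arithmetic shift, as GCC and Clang do.

```c
#include <stdint.h>

/* Hypothetical radix-2 butterfly on int32 data (not Ne10's code).
 * Each input is halved first, so the sum and difference always fit in
 * 32 bits. Computing a + b first and shifting afterwards could overflow
 * when both inputs are near the int32 limits. */
void butterfly_prescaled(int32_t a, int32_t b, int32_t *sum, int32_t *diff)
{
    int32_t as = a >> 1; /* scale before any arithmetic */
    int32_t bs = b >> 1;
    *sum = as + bs;      /* range [-2^31, 2^31 - 2]: no overflow */
    *diff = as - bs;     /* range [-2^31 + 1, 2^31 - 1]: no overflow */
}
```

The cost of this ordering is one bit of precision per input, since the low bit is discarded before the addition rather than after it.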
2014 Dec 18
1
[ARM][FFT][NEON] Integrate Ne10 into Opus?
Hi Ralph, I have pushed patches to enable radix 3 and radix 5. Github: https://github.com/projectNe10/Ne10/releases/tag/v1.2.0 Best Regards, Phil Wang > Date: Thu, 11 Dec 2014 10:46:50 -0800 > From: Ralph Giles <giles at thaumas.net> > Subject: Re: [opus] [ARM][FFT][NEON] Integrate Ne10 into Opus? > To: opus at xiph.org > Message-ID: <5489E69A.5000305 at thaumas.net>
2014 Dec 25
0
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?
There is definitely some use for a Neon fixed-point FFT. How much exactly I'm not sure. Fixed-point is a bit more than just a fall-back for CPUs with no FPU. There are CPUs for which fixed-point is still faster. It depends on the exact model but also on what you run. For example, even on x86 I believe that SILK encoding is slightly faster in fixed-point, even though CELT is faster in float.
2014 Dec 29
0
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize
On 28/12/14 11:04 PM, Phil Wang wrote: > It requires some extra effort if twiddles and input/output have > different bit width. Since Opus uses int32 for twiddles, we are going > to do the same thing. Actually, the existing Opus code has 16-bit twiddles, mostly because it makes it possible to use smulwb on ARMv5E. That being said, I agree that for Neon it makes sense to use 32-bit
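The reply above refers to multiplying 32-bit data by 16-bit twiddles with smulwb. That instruction multiplies a 32-bit value by a signed 16-bit value and keeps the top 32 bits of the 48-bit product. A portable sketch of that operation follows, together with a Q15 variant that is a natural fit for twiddle factors. The function names are hypothetical and are not taken from Opus or Ne10, and the code assumes `>>` on negative values is an arithmetic shift.

```c
#include <stdint.h>

/* Hypothetical portable equivalent of what smulwb computes:
 * the 48-bit product of a 32-bit and a 16-bit value, shifted right by 16. */
int32_t mult32x16_q16(int32_t a, int16_t b)
{
    return (int32_t)(((int64_t)a * b) >> 16);
}

/* Same multiply with b treated as a Q15 fraction (16384 represents 0.5).
 * Assumes the result fits in 32 bits, which holds for any a when b > -32768. */
int32_t mult32x16_q15(int32_t a, int16_t b)
{
    return (int32_t)(((int64_t)a * b) >> 15);
}
```

For example, `mult32x16_q15(1000, 16384)` multiplies 1000 by 0.5 in Q15 and yields 500.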
2015 Feb 05
0
[RFC PATCH v2] Encode optimize using libNe10
Hi Viswanath, Great to see it coming. > Phil, > > As you mentioned earlier, could you please address all > compile and linker errors/warnings coming out of Ne10 library? [Phil Wang] OK, I will deliver it. But I will try to add the -funsafe-math-optimizations flag to our build system first. Also I will have a look into our build system on Linux. From your previous response, I guess there
2015 Mar 04
0
[RFC PATCHv3] Encode optimize using libNe10
Hi Timothy and Viswanath, > FYI, I got Phil @ ARM to independently verify for any compile/link > warning/errors and he said he did not find any... And since I haven't > heard from you for a week, I went ahead and pushed RFCv3. Yes, I do get it built without compile/link warnings/errors. To save some time, please turn off other modules in Ne10. Open $NE10_DIR/CMakeLists.txt and find
2015 Feb 03
2
opus Digest, Vol 72, Issue 17
Hi all, I have already added support for scaled forward non-power-of-2 floating-point FFT: https://github.com/projectNe10/Ne10/commit/79c3d787302f8d74b9bcfe6545d487cdf1b101d9 Two flags are added to cfg structure: is_forward_scaled and is_backward_scaled. By setting is_forward_scaled to anything but zero, ne10_fft_c2c_1d_float32_neon will scale the output. So we can remove need for one buffer on
2014 Mar 31
5
[LLVMdev] Contributing the Apple ARM64 compiler backend
Hi, Apart from whether fast-isel should be enabled or disabled (I think enabled, personally), I haven't heard any dissenting voices about how to attack the merge problem yet. Tim, am I correct in saying that you believe AArch64 -> ARM64 is the right way to go? Does anyone disagree with that approach? Cheers, James ________________________________________ From: llvmdev-bounces at
2015 Feb 04
0
opus Digest, Vol 72, Issue 17
On 3 February 2015 at 01:31, Phil Wang <Phil.Wang at arm.com> wrote: > Hi all, > > I have already added support for scaled forward non-power-of-2 floating-point FFT: > https://github.com/projectNe10/Ne10/commit/79c3d787302f8d74b9bcfe6545d487cdf1b101d9 > > Two flags are added to cfg structure: is_forward_scaled and is_backward_scaled. > By setting is_forward_scaled to
2015 Jan 30
0
fixed point version for celt_pitch_xcorr on aarch64
Zhongwei Yao wrote: > Hi, all, > > Does Opus need celt_pitch_xcorr's fixed point version for ARM aarch64 > architecture? If yes, which version does Opus prefer: assembly or > intrinsics? It would be nice to have one. I don't have a lot of experience with aarch64 (I still haven't been able to obtain a dev board), so I don't really know how intrinsics compare to
2017 Jul 21
2
[PATCH] Fix celt_pitch_xcorr ARM jump table compiling error
Hi, Attached is a fix related to ARM optimization jump table compiling error. Thanks, Linfeng Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170720/661d96b5/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name:
2013 Jan 11
2
[LLVMdev] Documentation of fmuladd intrinsic
The fmuladd intrinsic is described as saying that a multiply and addition sequence can be fused into an fma instruction "if the code generator determines that the fused expression would be legal and efficient". (http://llvm.org/docs/LangRef.html#llvm-fma-intrinsic) I've spent a bit of time puzzling over how a code generator is supposed to know if it's legal to generate an fma
2015 Oct 16
1
[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library
Hi Timothy, Sorry for late reply. I have upstreamed the patch to fix the regression here: https://github.com/projectNe10/Ne10/commit/ee5d856cd9cb8c4a15ace567df4239f4e788d043 I have tested it with Vish's branch: http://git.linaro.org/people/viswanath.puttagunta/opus.git/shortlog/refs/heads/rfcv3_fft_fixed) Both unit test dft and unit test mdct passed on ARM v7/v8, floating point/fixed
2014 Nov 24
3
[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics
On 21 November 2014 at 18:06, Timothy B. Terriberry <tterribe at xiph.org> wrote: > > Viswanath Puttagunta wrote: >> >> a. Simplest use case to validate this optimization for correctness. >> b. Simplest use case to validate this optimization for performance. >> >> Would prefer something like opusdec that can be executed on command >> line. > >