thr3ads.net - similar to: "fixed point version for celt_pitch

Displaying 20 results from an estimated 1000 matches similar to: "fixed point version for celt_pitch_xcorr on aarch64"

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 24

Hi, I am working on the DSP module of Ne10. I see there are fixed-point and floating-point FFTs inside Opus. Is the fixed-point FFT only a fallback for CPUs without VFP? On ARMv7-A and ARMv8-A, benchmark results show that the fixed-point (int32) and floating-point (float32) FFTs have similar performance. I guess the fixed-point version is not often used on these platforms. Is it worth the effort to NEON-optimize

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize

2015 Jan 19

Hi Jean-Marc, I have implemented a fixed-point FFT with 32-bit twiddles. Now I want to evaluate its accuracy; what method does Opus use? I use a function implemented inside Ne10 to calculate SNR. Any comments?

| size | SNR (dB)  |
| 16   | 82.558587 |
| 32   | 83.530298 |
| 60   | 80.292433 |
| 64   | 82.752950 |
| 120  | 79.625077 |
| 128  | 83.091260 |
| 240  | 79.555263 |
| 256  |
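One common way to put a number on FFT accuracy, as asked in the snippet above, is to compare the output against a double-precision reference and report the ratio of signal energy to error energy in dB. The sketch below is only an illustration under that assumption: `dft`, `snr_db`, and `quantize` are hypothetical helper names, not the Ne10 function or the method Opus uses, and the quantizer merely stands in for a fixed-point transform's rounding error.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) reference DFT in double precision."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def snr_db(reference, test):
    """Return 10*log10(signal energy / error energy) for two complex sequences."""
    signal = sum(abs(r) ** 2 for r in reference)
    noise = sum(abs(r - t) ** 2 for r, t in zip(reference, test))
    if noise == 0.0:
        return math.inf  # identical sequences: no measurable error
    return 10.0 * math.log10(signal / noise)


def quantize(values, frac_bits):
    """Round real and imaginary parts to multiples of 2**-frac_bits.

    This is a stand-in for the rounding error of a fixed-point FFT,
    not a fixed-point FFT itself.
    """
    scale = 1 << frac_bits
    return [
        complex(round(v.real * scale) / scale, round(v.imag * scale) / scale)
        for v in values
    ]


signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(16)]
reference = dft(signal)
approx = quantize(reference, 15)
print("SNR (dB):", snr_db(reference, approx))
```

To measure a real implementation, replace `approx` with its output converted to floating point at the same scaling as the reference.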

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize

2014 Dec 29

Hi Timothy, It requires some extra effort if twiddles and input/output have different bit width. Since Opus uses int32 for twiddles, we are going to do the same thing. Thanks, Phil Wang

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 25

Jean-Marc Valin wrote:
> There is definitely some use for a Neon fixed-point FFT. How much
> exactly I'm not sure. Fixed-point is a bit more than just a fall-back

Well, we use fixed-point mode by default in Firefox for both Firefox OS and Fennec (Firefox on Android). The reason is that, although there is some NEON-class hardware where float does finally appear to be a little bit

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 26

Thanks Timothy and Jean-Marc, I will start NEON-optimizing the fixed-point FFT. Is int32 good enough? Benchmark data shows that the FFT using int16 is much faster than the FFT using int32.
> -----Original Message-----
> From: Timothy B. Terriberry [mailto:tterribe at xiph.org]
> Sent: Friday, December 26, 2014 6:52 AM
> To: Phil Wang; opus at xiph.org
> Cc: Zhongwei Yao; Yang Zhang; Zhou

[LLVMdev] Address Space Casting

2013 Sep 10

Hi,
| This patch introduces a new IR instruction named 'addrspacecast' that will be
| used to represent the casting operation between pointers of different address
| spaces. This instruction will represent whatever kind of conversion (potentially
| both value and size of the pointer) and the semantic of the conversion between a
| pair of address spaces is target specific.

Assuming I

opus Digest, Vol 76, Issue 11

2015 May 11

Hi Jean-Marc, Thanks for pointing us the way. Yes, it is an overflow problem. I moved all scaling code in front of any other operations, and test_unit_mdct passes for all sizes. I will update Ne10 right after Vish double-checks it on hardware. He will repost patches with more verification later this week. Regards, Phil Wang

Well, I see three questions that need to be answered at this point
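The overflow issue described above can be illustrated with a single butterfly-style add on 32-bit values. This is a minimal sketch, not the actual Ne10 change: `wrap32`, `sum_then_scale`, and `scale_then_sum` are hypothetical names, and wrapping to two's complement is assumed as the overflow behaviour.

```python
def wrap32(v):
    """Wrap an integer to signed 32-bit two's complement (assumed overflow model)."""
    v &= 0xFFFFFFFF
    return v - 0x100000000 if v & 0x80000000 else v


def sum_then_scale(a, b):
    """Add first, halve afterwards: the intermediate sum can overflow 32 bits."""
    return wrap32(a + b) >> 1


def scale_then_sum(a, b):
    """Halve the inputs first: the sum stays in range, at the cost of one low bit each."""
    return wrap32((a >> 1) + (b >> 1))


a = b = 0x60000000  # both inputs fit in int32, but their sum does not
print("scale last :", sum_then_scale(a, b))
print("scale first:", scale_then_sum(a, b))
```

Scaling first trades a little precision in the low bits for headroom, which is the usual fixed-point compromise.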

[ARM][FFT][NEON] Integrate Ne10 into Opus?

2014 Dec 18

Hi Ralph, I have pushed patches to enable radix 3 and radix 5. Github: https://github.com/projectNe10/Ne10/releases/tag/v1.2.0 Best Regards, Phil Wang
> Date: Thu, 11 Dec 2014 10:46:50 -0800
> From: Ralph Giles <giles at thaumas.net>
> Subject: Re: [opus] [ARM][FFT][NEON] Integrate Ne10 into Opus?
> To: opus at xiph.org
> Message-ID: <5489E69A.5000305 at thaumas.net>

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 25

There is definitely some use for a Neon fixed-point FFT. How much exactly I'm not sure. Fixed-point is a bit more than just a fall-back for CPUs with no FPU. There are CPUs for which fixed-point is still faster. It depends on the exact model but also on what you run. For example, even on x86 I believe that SILK encoding is slightly faster in fixed-point, even though CELT is faster in float.

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize

2014 Dec 29

On 28/12/14 11:04 PM, Phil Wang wrote:
> It requires some extra effort if twiddles and input/output have
> different bit width. Since Opus uses int32 for twiddles, we are going
> to do the same thing.

Actually, the existing Opus code has 16-bit twiddles, mostly because it makes it possible to use smulwb on ARMv5E. That being said, I agree that for Neon it makes sense to use 32-bit
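For readers unfamiliar with the instruction mentioned above: ARM's SMULWB multiplies a signed 32-bit value by the signed bottom 16 bits of a second register and keeps bits [47:16] of the 48-bit product, which is why a 16-bit twiddle is a natural fit. The following is a small arithmetic model of that behaviour, not Opus code; `to_int16` and `smulwb` are illustrative Python names, and the Q15 twiddle value is an example chosen here.

```python
def to_int16(v):
    """Interpret the low 16 bits of v as a signed 16-bit integer."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def smulwb(a32, b):
    """Model of SMULWB: a32 * (signed low half of b), keeping bits [47:16].

    Python's >> on a negative int is an arithmetic shift, which matches
    taking the upper bits of the two's-complement product.
    """
    return (a32 * to_int16(b)) >> 16


sample = 1 << 20   # a 32-bit data value
twiddle = 23170    # round(cos(pi/4) * 2**15), i.e. a Q15 twiddle (example value)
print(smulwb(sample, twiddle))
```

Because the product is shifted right by 16 rather than 15, a Q15 twiddle yields half the mathematically scaled result; a caller wanting `sample * 0.7071` would shift the result left by one.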

[RFC PATCH v2] Encode optimize using libNe10

2015 Feb 05

Hi Viswanath, Great to see it coming.
> Phil,
>
> As you mentioned earlier, could you please address all
> compile and linker errors/warnings coming out of Ne10 library?

[Phil Wang] OK, I will deliver it. But I will try to add the -funsafe-math-optimizations flag to our build system first. Also I will have a look into our build system on Linux. From your previous response, I guess there

[RFC PATCHv3] Encode optimize using libNe10

2015 Mar 04

Hi Timothy and Viswanath,
> FYI, I got Phil @ ARM to independently verify for any compile/link
> warning/errors and he said he did not find any... And since I haven't
> heard from you for a week, I went ahead and pushed RFCv3.

Yes, I do get it built without compile/link warning/errors. To save some time, please turn off other modules in Ne10. Open $NE10_DIR/CMakeLists.txt and find

opus Digest, Vol 72, Issue 17

2015 Feb 03

Hi all, I have already added support for scaled forward non-power-of-2 floating-point FFT: https://github.com/projectNe10/Ne10/commit/79c3d787302f8d74b9bcfe6545d487cdf1b101d9 Two flags are added to the cfg structure: is_forward_scaled and is_backward_scaled. By setting is_forward_scaled to anything but zero, ne10_fft_c2c_1d_float32_neon will scale the output. So we can remove the need for one buffer on
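What a "scaled forward" transform means can be shown with a naive DFT. The sketch below assumes the scaling factor is 1/N, which the snippet does not state, and it does not call Ne10: `forward_dft` is a hypothetical function that only borrows the flag name `is_forward_scaled` from the message above.

```python
import cmath
import math


def forward_dft(x, is_forward_scaled=False):
    """Naive forward DFT for any length, optionally scaled by 1/N (assumed factor)."""
    n = len(x)
    gain = 1.0 / n if is_forward_scaled else 1.0
    return [
        gain * sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


x = [1.0] * 6  # length 6 is not a power of two
print(forward_dft(x)[0])                          # unscaled DC bin
print(forward_dft(x, is_forward_scaled=True)[0])  # scaled DC bin
```

Folding the gain into the transform means the caller does not need a separate pass over the output to apply it.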

[LLVMdev] Contributing the Apple ARM64 compiler backend

2014 Mar 31

Hi, Apart from whether fast-isel should be enabled or disabled (I think enabled, personally), I haven't heard any dissenting voices about how to attack the merge problem yet. Tim, am I correct in saying that you believe AArch64 -> ARM64 is the right way to go? Does anyone disagree with that approach? Cheers, James ________________________________________ From: llvmdev-bounces at

opus Digest, Vol 72, Issue 17

2015 Feb 04

On 3 February 2015 at 01:31, Phil Wang <Phil.Wang at arm.com> wrote:
> Hi all,
>
> I have already added support for scaled forward non-power-of-2 floating-point FFT:
> https://github.com/projectNe10/Ne10/commit/79c3d787302f8d74b9bcfe6545d487cdf1b101d9
>
> Two flags are added to cfg structure: is_forward_scaled and is_backward_scaled.
> By setting is_forward_scaled to

fixed point version for celt_pitch_xcorr on aarch64

2015 Jan 30

Zhongwei Yao wrote:
> Hi, all,
>
> Does Opus need celt_pitch_xcorr's fixed point version for ARM aarch64
> architecture? If yes, which version does Opus prefer: assembly or
> intrinsics?

It would be nice to have one. I don't have a lot of experience with aarch64 (I still haven't been able to obtain a dev board), so I don't really know how intrinsics compare to
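As background for the question above, a pitch cross-correlation computes, for each candidate lag, the inner product of a short window with a lagged slice of a longer signal. The sketch below is a plain integer reference under that assumption: `pitch_xcorr` and its argument order are illustrative, not the actual Opus signature, and Python integers do not model the 16-bit inputs and 32-bit accumulators a real fixed-point build would use.

```python
def pitch_xcorr(x, y, length, max_pitch):
    """Return [sum_j x[j] * y[i + j] for each lag i in 0..max_pitch-1].

    x needs at least `length` samples and y at least
    `length + max_pitch - 1` samples.
    """
    if len(x) < length or len(y) < length + max_pitch - 1:
        raise ValueError("input buffers are too short for the requested lags")
    return [
        sum(x[j] * y[i + j] for j in range(length))
        for i in range(max_pitch)
    ]


x = [1, 2, 3]
y = [1, 0, 2, 1, 3]
print(pitch_xcorr(x, y, length=3, max_pitch=3))
```

A reference like this is mainly useful as an oracle when checking an assembly or intrinsics version for bit-exact output.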

[PATCH] Fix celt_pitch_xcorr ARM jump table compiling error

2017 Jul 21

Hi, Attached is a fix related to ARM optimization jump table compiling error. Thanks, Linfeng Zhang

[LLVMdev] Documentation of fmuladd intrinsic

2013 Jan 11

The fmuladd intrinsic is described as saying that a multiply and addition sequence can be fused into an fma instruction "if the code generator determines that the fused expression would be legal and efficient". (http://llvm.org/docs/LangRef.html#llvm-fma-intrinsic) I've spent a bit of time puzzling over how a code generator is supposed to know if it's legal to generate an fma

[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library

2015 Oct 16

Hi Timothy, Sorry for the late reply. I have upstreamed the patch to fix the regression here: https://github.com/projectNe10/Ne10/commit/ee5d856cd9cb8c4a15ace567df4239f4e788d043 I have tested it with Vish's branch: http://git.linaro.org/people/viswanath.puttagunta/opus.git/shortlog/refs/heads/rfcv3_fft_fixed Both unit test dft and unit test mdct passed on ARM v7/v8, floating point/fixed

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 24

On 21 November 2014 at 18:06, Timothy B. Terriberry <tterribe at xiph.org> wrote: > > Viswanath Puttagunta wrote: >> >> a. Simplest use case to validate this optimization for correctness. >> b. Simplest use case to validate this optimization for performance. >> >> Would prefer something like opusdec that can be executed on command >> line. > >
