thr3ads.net - similar to: "[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?"

Displaying 20 results from an estimated 3000 matches similar to: "[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?"

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize

2015 Jan 19

Hi Jean-Marc, I have implemented a fixed-point FFT with 32-bit twiddles. Now I want to evaluate its accuracy; what method does Opus use? I use a function implemented inside Ne10 to calculate the SNR. Any comments?

| size | SNR (dB)  |
| 16   | 82.558587 |
| 32   | 83.530298 |
| 60   | 80.292433 |
| 64   | 82.752950 |
| 120  | 79.625077 |
| 128  | 83.091260 |
| 240  | 79.555263 |
| 256  |
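
For readers unfamiliar with how an FFT SNR figure like the ones above is typically obtained, here is a minimal Python sketch. It is an assumption about the general method (signal power over error power against a double-precision reference DFT), not the code used by Ne10 or by the Opus unit tests. The helper names `naive_dft`, `snr_db`, and `quantize` are hypothetical, and `quantize` merely stands in for the output of a fixed-point FFT.

```python
import cmath
import math

def naive_dft(x):
    """Reference O(N^2) DFT computed in double precision."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * k * m / n) for k in range(n))
            for m in range(n)]

def snr_db(reference, test):
    """SNR in dB: reference signal power divided by error power."""
    signal = sum(abs(r) ** 2 for r in reference)
    noise = sum(abs(r - t) ** 2 for r, t in zip(reference, test))
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / noise)

def quantize(values, frac_bits):
    """Hypothetical stand-in for a fixed-point FFT output:
    round each component to a grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return [complex(round(v.real * scale) / scale,
                    round(v.imag * scale) / scale) for v in values]

# Assumed test signal; any non-trivial input would do.
x = [complex(math.sin(0.3 * k), math.cos(0.7 * k)) for k in range(60)]
ref = naive_dft(x)
approx = quantize(ref, 15)
print(f"nfft=60 snr = {snr_db(ref, approx):.2f} dB")
```

A real evaluation would replace `quantize(ref, 15)` with the output of the fixed-point transform under test, converted back to floating point with the appropriate scale factor.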

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize

2014 Dec 29

Hi Timothy, It requires some extra effort if twiddles and input/output have different bit widths. Since Opus uses int32 for twiddles, we are going to do the same thing. Thanks, Phil Wang

[ARM][FFT][NEON] Integrate Ne10 into Opus?

2014 Dec 18

Hi Ralph, I have pushed patches to enable radix 3 and radix 5. Github: https://github.com/projectNe10/Ne10/releases/tag/v1.2.0
Best Regards, Phil Wang
> Date: Thu, 11 Dec 2014 10:46:50 -0800
> From: Ralph Giles <giles at thaumas.net>
> Subject: Re: [opus] [ARM][FFT][NEON] Integrate Ne10 into Opus?
> To: opus at xiph.org
> Message-ID: <5489E69A.5000305 at thaumas.net>

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 25

Jean-Marc Valin wrote:
> There is definitely some use for a Neon fixed-point FFT. How much
> exactly I'm not sure. Fixed-point is a bit more than just a fall-back
Well, we use fixed-point mode by default in Firefox for both Firefox OS and Fennec (Firefox on Android). The reason is that, although there is some NEON-class hardware where float does finally appear to be a little bit

[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library

2015 Oct 06

I'm trying to get these cleaned up and landed, but I'm running into some trouble with this patch. Using commit a08b29d88e3c (July 21) of Ne10, I'm seeing test failures for 60-point FFTs:
nfft=60 inverse=0,snr = -3.312408
** poor snr: -3.312408 **
nfft=60 inverse=1,snr = -16.079597
** poor snr: -16.079597 **
All other sizes tested appear to work fine (84 to 140 dB of SNR). This

fixed point version for celt_pitch_xcorr on aarch64

2015 Jan 27

Hi all, Does Opus need a fixed-point version of celt_pitch_xcorr for the ARM aarch64 architecture? If yes, which version does Opus prefer: assembly or intrinsics? Thanks, Zhongwei

opus Digest, Vol 72, Issue 17

2015 Feb 03

Hi all, I have already added support for scaled forward non-power-of-2 floating-point FFT: https://github.com/projectNe10/Ne10/commit/79c3d787302f8d74b9bcfe6545d487cdf1b101d9 Two flags are added to cfg structure: is_forward_scaled and is_backward_scaled. By setting is_forward_scaled to anything but zero, ne10_fft_c2c_1d_float32_neon will scale the output. So we can remove need for one buffer on
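
To illustrate what a "scaled forward" transform means in the message above, here is a small Python sketch. It does not call Ne10. The function `dft` and its `scaled` parameter are hypothetical, and the 1/N scale factor is an assumption about what the flag does rather than something the excerpt states.

```python
import cmath
import math

def dft(x, inverse=False, scaled=False):
    """Naive DFT; when scaled is true, the output is divided by N
    inside the transform instead of in a separate pass."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
           for m in range(n)]
    if scaled:
        out = [v / n for v in out]
    return out

# Scaled forward followed by an unscaled inverse recovers the input.
x = [complex(k % 5, -k % 3) for k in range(60)]
spectrum = dft(x, scaled=True)
roundtrip = dft(spectrum, inverse=True)
max_err = max(abs(a - b) for a, b in zip(x, roundtrip))
print(f"max round-trip error: {max_err:.3e}")
```

Folding the scale into the transform means the caller does not need a separate scaling pass over the output.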

[ARM][FFT][NEON] Integrate Ne10 into Opus?

2014 Dec 11

Hi everyone, I am working on the Ne10 project. Ne10 provides NEON-optimized FFT routines that are much faster (compared to those without NEON) on most ARMv7-A and all ARMv8-A devices. How about integrating it into Opus? I am not familiar with the configure script, but I found "Optional Packages" in it. If we provide a --with-ne10-fft option, the one extra thing that users need to do is to

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 25

There is definitely some use for a Neon fixed-point FFT. How much exactly I'm not sure. Fixed-point is a bit more than just a fall-back for CPUs with no FPU. There are CPUs for which fixed-point is still faster. It depends on the exact model but also on what you run. For example, even on x86 I believe that SILK encoding is slightly faster in fixed-point, even though CELT is faster in float.

[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library

2015 Oct 16

Hi Timothy, Sorry for the late reply. I have upstreamed the patch to fix the regression here: https://github.com/projectNe10/Ne10/commit/ee5d856cd9cb8c4a15ace567df4239f4e788d043 I have tested it with Vish's branch: http://git.linaro.org/people/viswanath.puttagunta/opus.git/shortlog/refs/heads/rfcv3_fft_fixed Both unit test dft and unit test mdct passed on ARM v7/v8, floating point/fixed

opus Digest, Vol 72, Issue 17

2015 Feb 04

Viswanath Puttagunta wrote:
> What should we do for power-of-2? I really want to avoid putting
> runtime checks if nfft is power of 2 in opus_fft_float_neon.
Given the tests that had to be disabled for NE10, I suspect we will not really be able to use it for CUSTOM_MODES, which should be the only time nfft is a power of 2. So I'd suggest just disabling the support when CUSTOM_MODES

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 25

> I am working on DSP module of Ne10. I see there are fixed-point and > floating-point FFT inside Opus. Is fixed-point FFT only a fall back for CPU > without VFP? On ARMv7-A and ARMv8-A, benchmark result shows that fixed-point > (int32) and floating-point (float32) FFT have similar performance. I guess > fixed-point version is not often used on these platforms. Is it worth the >

[RFC PATCH v1 0/8] Ne10 fft fixed and previous

2015 Apr 30

On 29 April 2015 at 17:22, Timothy B. Terriberry <tterribe at xiph.org> wrote:
>
> Viswanath Puttagunta wrote:
>>
>> This patch series is follow up on work I posted on [1].
>> In addition to what was posted on [1], this patch series mainly
>> integrates Fixed point FFT implementations in NE10 library into opus.
>> You can view my opus wip code at [2].
>

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize Fixed-Point FFT?

2014 Dec 26

Thanks Timothy and Jean-Marc, I will start NEON-optimizing the fixed-point FFT. Is int32 good enough? Benchmark data shows that an FFT using int16 is much faster than one using int32.
> -----Original Message-----
> From: Timothy B. Terriberry [mailto:tterribe at xiph.org]
> Sent: Friday, December 26, 2014 6:52 AM
> To: Phil Wang; opus at xiph.org
> Cc: Zhongwei Yao; Yang Zhang; Zhou

opus Digest, Vol 76, Issue 11

2015 May 11

Hi Jean-Marc, Thanks for pointing us in the right direction. Yes, it is an overflow problem. I moved all the scaling code ahead of any other operations, and test_unit_mdct now passes for all sizes. I will update Ne10 right after Vish double-checks it on hardware. He will repost the patches with more verification later this week. Regards, Phil Wang
Well, I see three questions that need to be answered at this point
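
As a rough illustration of why moving the scaling ahead of the other operations can cure an overflow, the Python sketch below emulates wrapping 32-bit integer addition. It is not the Ne10 fix itself. The helpers `wrap_int32`, `sum_then_scale`, and `scale_then_sum` are hypothetical, and the wraparound model is an assumption that mirrors how non-saturating integer adds behave.

```python
def wrap_int32(v):
    """Wrap a Python int into the two's-complement 32-bit range."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v

def sum_then_scale(values, shift):
    """Accumulate at full magnitude, scale at the end (can overflow)."""
    acc = 0
    for v in values:
        acc = wrap_int32(acc + v)
    return acc >> shift

def scale_then_sum(values, shift):
    """Scale every input first, then accumulate (keeps headroom)."""
    acc = 0
    for v in values:
        acc = wrap_int32(acc + (v >> shift))
    return acc

# Four large inputs whose true sum needs more than 32 bits.
inputs = [0x60000000] * 4
late = sum_then_scale(inputs, 2)    # corrupted by intermediate wraparound
early = scale_then_sum(inputs, 2)   # matches the exact value
exact = sum(inputs) >> 2
print(late, early, exact)
```

The trade-off is precision: scaling first discards the low bits of each input before they can contribute to the sum, so the early-scaling variant exchanges a small amount of rounding noise for freedom from overflow.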

[RFC][FFT][Fixed-Point][NEON] NEON-Optimize

2014 Dec 29

On 28/12/14 11:04 PM, Phil Wang wrote:
> It requires some extra effort if twiddles and input/output have
> different bit width. Since Opus uses int32 for twiddles, we are going
> to do the same thing.
Actually, the existing Opus code has 16-bit twiddles, mostly because it makes it possible to use smulwb on ARMv5E. That being said, I agree that for Neon it makes sense to use 32-bit
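
For context on why 16-bit twiddles suit smulwb, the Python sketch below emulates the instruction's arithmetic: a signed 32-bit operand is multiplied by the signed bottom halfword of the second operand, and the top 32 bits of the 48-bit product are kept. The helper `mult_q15` is a hypothetical name showing how a 32-bit sample could be multiplied by a Q15 twiddle on top of that primitive. It is an illustration under those assumptions, not the Opus macro.

```python
def to_int16(v):
    """Interpret the low 16 bits of v as a signed halfword."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v

def smulwb(a32, b):
    """Emulate ARM SMULWB: a32 times the signed bottom halfword of b,
    keeping the top 32 bits of the 48-bit product."""
    return (a32 * to_int16(b)) >> 16

def mult_q15(a32, twiddle_q15):
    """Hypothetical helper: 32-bit sample times a Q15 twiddle.
    The product is shifted right by 16, so one left shift restores
    the Q15 scaling at the cost of the lowest result bit."""
    return smulwb(a32, twiddle_q15) << 1

half = 1 << 14  # 0.5 in Q15
print(mult_q15(1000000, half))
print(mult_q15(1000000, -half))
```

A single instruction therefore covers the 32x16 multiply, which is the property the message attributes to the 16-bit twiddle choice on ARMv5E.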

(no subject)

2015 May 08

Hello Jean-Marc, Below are the results that show test_unit_dft passes, but test_unit_mdct fails (only for nfft=480, 960, 1920)
Note: Tested on BeagleboneBlack(Cortex-A8) fixed point on branch [1]
./test_unit_dft
nfft=32 inverse=0,snr = 88.394372
nfft=32 inverse=1,snr = 93.896470
nfft=128 inverse=0,snr = 89.185895
nfft=128 inverse=1,snr = 93.537021
nfft=256 inverse=0,snr = 88.353151
nfft=256

(no subject)

2015 May 08

Hello Jean-Marc, Yep, that was it. With your patch, test_unit_mdct passes for all nfft. So, what do you suggest as the next step here? Regards, Vish
On 8 May 2015 at 12:30, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi,
>
> Can you apply this change to the MDCT test and run it again. See if more
> (all) sizes pass. Given the results, I strongly suspect an

[LLVMdev] Segfault on AArch64 LNT

2014 Oct 16

Hi, Have you guys seen this? http://lab.llvm.org:8011/builders/clang-aarch64-lnt/builds/1522 There are a lot of commits in there, and I'm far away from ARM64 hardware for a few days, so if one of you guys could have a look, it'd be great. :) cheers, --renato

[LLVMdev] RFC: Recursive inlining

2015 Feb 18

Hi, Apologies for the very late response. We have manually tried the idea with a very simple Fibonacci sequence code. While being very very simple, the recursion cannot be handled by TRE. Because there are two recursive callsites, it also needs to keep some sort of state across iterations of the "while(stack not empty)" loop. We get between 2.5 and 8x slowdowns depending on which
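
The transformation discussed in this message can be pictured with a small Python sketch. It is an assumption about the general shape of such a rewrite, not the code the authors benchmarked. Because Fibonacci has two recursive call sites, each frame on the explicit stack carries a resume state and a saved partial result. The names `fib_recursive` and `fib_explicit_stack` are hypothetical.

```python
def fib_recursive(n):
    """Plain doubly recursive Fibonacci."""
    return n if n < 2 else fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_explicit_stack(n):
    """Same computation driven by a while-stack-not-empty loop.
    Each frame is [argument, resume_state, saved_first_result]."""
    stack = [[n, 0, 0]]
    result = 0
    while stack:
        frame = stack[-1]
        arg, state = frame[0], frame[1]
        if arg < 2:
            result = arg                   # base case returns immediately
            stack.pop()
        elif state == 0:
            frame[1] = 1                   # resume after the first call
            stack.append([arg - 1, 0, 0])
        elif state == 1:
            frame[2] = result              # save fib(arg - 1)
            frame[1] = 2                   # resume after the second call
            stack.append([arg - 2, 0, 0])
        else:
            result = frame[2] + result     # combine both results
            stack.pop()
    return result

print(fib_explicit_stack(10))
```

The per-frame bookkeeping is the extra state the message refers to.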
