thr3ads.net - similar to: "about optimization"

Displaying 20 results from an estimated 10000 matches similar to: "about optimization"

2017 Apr 12

CELT CFFT size configuration

Dear all, Sorry for this simple and maybe stupid question. I'm working on an implementation of Opus for an ARMv7e microcontroller using the CMSIS/DSP library, which is used to calculate the CFFT and the MDCT based on the DCT-IV. My question is: in the implementation of CELT you have used a fixed-length CFFT equal to 1920. I want to know if there are any issues that can appear if I change that configuration

2015 Apr 30

[RFC PATCH v1 0/8] Ne10 fft fixed and previous

On 29 April 2015 at 17:22, Timothy B. Terriberry <tterribe at xiph.org> wrote:
>
> Viswanath Puttagunta wrote:
>>
>> This patch series is follow up on work I posted on [1].
>> In addition to what was posted on [1], this patch series mainly
>> integrates Fixed point FFT implementations in NE10 library into opus.
>> You can view my opus wip code at [2].
>

2014 Feb 14

about optimization

On 2014-02-13 5:03 PM, Xiangming Zhu (xiangzhu) wrote:
> We have some optimizations for x86-64 and SSE enabled CPU since encoding
> Opus stream on server side is time sensitive.
>
> The optimization is based on opus-1.0.3, and after 1.1 is released we
> have to merge the changes.
>
> The question is can we share back the optimizations to Opus repository?
>

2015 May 08

(no subject)

Hello Jean-Marc,
Below are the results that show test_unit_dft passes, but test_unit_mdct fails (only for nfft=480, 960, 1920).
Note: Tested on BeagleboneBlack (Cortex-A8) fixed point on branch [1]
./test_unit_dft
nfft=32 inverse=0,snr = 88.394372
nfft=32 inverse=1,snr = 93.896470
nfft=128 inverse=0,snr = 89.185895
nfft=128 inverse=1,snr = 93.537021
nfft=256 inverse=0,snr = 88.353151
nfft=256

2014 Feb 21

Make check failure on clone from 31 January

I tracked down the bug to an incorrect use of restrict. I would not consider this a compiler bug: we are lying to the optimizer by telling it that a pointer is restrict when in fact it isn't. This can be fixed like so:
diff --git a/celt/mdct.c b/celt/mdct.c
index 1634e8e..fa5098c 100644
--- a/celt/mdct.c
+++ b/celt/mdct.c
@@ -276,8 +276,8 @@ void clt_mdct_backward(const mdct_lookup *l,
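The remark about lying to the optimizer can be illustrated with a small C sketch. This is not the Opus code: add_restrict and add_may_alias are hypothetical names chosen for illustration. The first function promises, via restrict, that the output buffer does not overlap the inputs, so it is only valid for out-of-place calls. The second makes no such promise and may be called in place.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example, not taken from celt/mdct.c.
 * `restrict` on `out` is a promise by the CALLER that the memory written
 * through `out` is not also accessed through `a` or `b` during the call.
 * Passing overlapping buffers here is undefined behavior. */
void add_restrict(const float *restrict a, const float *restrict b,
                  float *restrict out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Same loop without `restrict`: the compiler must assume the buffers may
 * overlap, so an in-place call such as add_may_alias(x, x, x, n) is valid. */
void add_may_alias(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Calling add_restrict(x, x, x, n) would compile, and might even appear to work, but the compiler is entitled to reorder or vectorize the loads and stores on the assumption that the buffers are disjoint. Dropping the qualifier from pointers that can alias is the kind of fix the message describes.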

2015 May 08

(no subject)

Hello Jean-Marc,
Yep, that was it. With your patch, test_unit_mdct passes for all nfft. So, what do you suggest the next step here is?
Regards, Vish
On 8 May 2015 at 12:30, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote:
> Hi,
>
> Can you apply this change to the MDCT test and run it again. See if more
> (all) sizes pass. Given the results, I strongly suspect an

2015 Feb 04

[RFC PATCH v2] Encode optimize using libNe10

Changes from RFC PATCH v1:
- passing arch parameter explicitly
- reduced stack usage by ~3.5K by using scaled NE10 fft version
- moved all optimization array functions to arm_celt_map.c
- Other cleanups pointed out by Timothy
Phil, As you mentioned earlier, could you please address all compile and linker errors/warnings coming out of Ne10 library? You can find my working Ne10 repo at [1] You

2014 Feb 24

Make check failure on clone from 31 January

After a few experiments, I found that both alternatives are very similar, and 2~5% slower compared to the following:
diff --git a/celt/mdct.c b/celt/mdct.c
index 1634e8e..e490c3b 100644
--- a/celt/mdct.c
+++ b/celt/mdct.c
@@ -277,7 +277,7 @@ void clt_mdct_backward(const mdct_lookup *l, kiss_fft_scalar *in, kiss_fft_scala
 it in-place. */
 {
 kiss_fft_scalar * OPUS_RESTRICT yp0 =

2015 Mar 03

[RFC PATCH v4] Enable optimize using libNe10

Changes from RFC PATCH v3
- Just rebased on tip
- For all else, please see notes from RFC PATCH v3 at http://lists.xiph.org/pipermail/opus/2015-March/002902.html
- latest wip opus tree/branch https://git.linaro.org/people/viswanath.puttagunta/Ne10.git branch: rfcv4_final_fft_ne10
Viswanath Puttagunta (1):
  armv7(float): Optimize encode usecase using NE10 library
Makefile.am

2015 Mar 03

[RFC PATCHv3] Encode optimize using libNe10

Changes from RFC PATCH v2
- fixed compile issue when just compiling for --enable-intrinsics for ARMv7 without NE10
- Notes for NE10:
  - All compile/link warnings are now in upstream NE10
  - Only patch pending upstream in NE10 is the one that needs to add -funsafe-math-optimizations for ARMv7 targets.
  - Phil Wang @ ARM is working on getting this fixed.
  - Note that even without

2014 Feb 05

Make check failure on clone from 31 January

Hi,
Apologies if this is a known issue, but running make on revision e3187444692195957eb66989622c7b1ad8448b06 fails one of the tests when using fixed point configuration (floating point is ok) on my linux x86. Note that libopus1.1, as extracted from the tar ball, is OK.
Specifically, the tests that fail are in celt/tests/test_unit_mdct:
nfft=32 inverse=0,snr = 85.341197
nfft=32 inverse=1,snr =

2015 Apr 28

[RFC PATCH v1 0/8] Ne10 fft fixed and previous

Hello Timothy / Jean-Marc / opus-dev, This patch series is follow up on work I posted on [1]. In addition to what was posted on [1], this patch series mainly integrates Fixed point FFT implementations in NE10 library into opus. You can view my opus wip code at [2]. Note that while I found some issues both with the NE10 library(fixed fft) and with Linaro toolchain (armv8 intrinsics), the work

2015 Jan 20

[RFC PATCH v1 0/2] Encode optimize using libNE10

Hello opus-dev,
I've been cooking up this patchset to integrate NE10 library into opus. Current patchset focuses on encode use case, mainly affecting performance of clt_mdct_forward() and opus_fft() (for float only).
Glad to report the following on Encode use case (measured on my Beaglebone Black Cortex-A8 board):
- Performance improvement for encode use case ~= 12.34% (Based on time -p

2005 Jan 26

Butterflies in mdct.c

In mdct.c there are some functions implementing N-point butterflies. The 32-point and 16-point functions each call the smaller-point function twice, once on each half of the data. When I looked at it, I found that it's just linear algebra, so it can be rewritten as a matrix multiplication. Someone may say that there's an optimization from working in registers. But imagine there's one call to the 32-point,
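The observation that cascaded butterflies amount to a matrix multiplication can be sketched in C on a deliberately simplified case. The assumptions here are mine, not from mdct.c: the transform is 4-point, it has no twiddle factors (a plain Walsh-Hadamard-style butterfly), and the names butterfly2, transform4_butterflies and transform4_matrix are hypothetical.

```c
#include <stddef.h>

/* Basic 2-point butterfly: (a, b) -> (a + b, a - b). */
static void butterfly2(float *x, float *y)
{
    float a = *x, b = *y;
    *x = a + b;
    *y = a - b;
}

/* 4-point transform built from two stages of 2-point butterflies. */
void transform4_butterflies(float v[4])
{
    butterfly2(&v[0], &v[2]);   /* stage 1 */
    butterfly2(&v[1], &v[3]);
    butterfly2(&v[0], &v[1]);   /* stage 2 */
    butterfly2(&v[2], &v[3]);
}

/* The same linear map written as one 4x4 matrix-vector product. */
void transform4_matrix(float v[4])
{
    static const float M[4][4] = {
        { 1.0f,  1.0f,  1.0f,  1.0f },
        { 1.0f, -1.0f,  1.0f, -1.0f },
        { 1.0f,  1.0f, -1.0f, -1.0f },
        { 1.0f, -1.0f, -1.0f,  1.0f },
    };
    float out[4];
    for (size_t r = 0; r < 4; r++) {
        out[r] = 0.0f;
        for (size_t c = 0; c < 4; c++)
            out[r] += M[r][c] * v[c];
    }
    for (size_t r = 0; r < 4; r++)
        v[r] = out[r];
}
```

Both functions compute the same result, but in this sketch the butterfly form needs 8 additions or subtractions, while the dense matrix form needs 16 multiplications and 12 additions. That operation-count gap is why the staged form is usually preferred even though the two are algebraically equivalent.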

2015 Mar 04

[RFC PATCH v1] Decode(float) optimize using libNe10

Hello All,
I extended the libNE10 optimizations for float towards mdct_backwards/opus_ifft. I am able to get about 14.26% improvement for Decode use case now on my Beaglebone Black. Please see [1] for measurements.
Questions
1. Since this patch needs to go in after the Encode patch [2], should I submit this as a patch series?
2. Since Jonathan Lennox posted intrinsics cleanup [3] patch, should