thr3ads.net - search: "filter

Displaying 20 results from an estimated 29 matches for "filter_mem2".

fir_mem16,iir_mem16 and filter_mem16 optimisations

2008 Aug 02

...; > I am going to implement these functions in assembler, but it is hard to do without fully understanding how the functions work. > > These are direct-form II transposed filters. There are actually two ways > to compute them. For an alternate way, have a look at the commented > version of filter_mem2() completely at the bottom of filters_bfin.h. The > only thing you won't need from that one are the shifts by SIG_SHIFT > because the inputs and outputs of filter_mem16() are already 16 bits. > > Jean-Marc > I am still in doubt. Could you tell me in more detail about filter_mem16()...
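For readers following the thread, a direct-form II transposed filter can be sketched in plain floating-point C as below. This is a minimal illustration, not the Speex source: the name df2t_filter is hypothetical, the argument order mirrors the filter_mem2 prototype quoted elsewhere on this page, and it assumes den[0] is 1, that num and den hold ord+1 coefficients, and that mem holds ord state values. The fixed-point filter_mem16() additionally involves scaling and saturation, which are not shown here.

```c
#include <assert.h>

/* Hypothetical helper, not the Speex implementation.
 * Direct-form II transposed filter:
 *   y[i]   = num[0]*x[i] + mem[0]
 *   mem[j] = mem[j+1] + num[j+1]*x[i] - den[j+1]*y[i]   (j < ord-1)
 *   mem[ord-1] = num[ord]*x[i] - den[ord]*y[i]
 * Assumes den[0] == 1; num/den have ord+1 entries; mem has ord entries. */
void df2t_filter(const float *x, const float *num, const float *den,
                 float *y, int N, int ord, float *mem)
{
    int i, j;
    assert(ord >= 1);
    for (i = 0; i < N; i++) {
        float xi = x[i];
        /* Output is the scaled input plus the first state value. */
        float yi = num[0] * xi + mem[0];
        /* Shift the state down, adding this sample's contributions. */
        for (j = 0; j < ord - 1; j++)
            mem[j] = mem[j + 1] + num[j + 1] * xi - den[j + 1] * yi;
        mem[ord - 1] = num[ord] * xi - den[ord] * yi;
        y[i] = yi;
    }
}
```

Because the state is carried in mem between calls, a long signal can be filtered frame by frame by calling the function repeatedly with the same mem array.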

cpu utilization across speex versions

2005 May 29

...trying to optimize this 0.02%! :) Anyway, here are the results with just the top functions listed. Note that Speex distributes its CPU load across many functions...
Speex 1.0.1:
With preprocessor: (overall 2.51%)
0.43% split_cb_search_shape_sign
0.26% speex_preprocess
0.26% fir_mem_up
0.23% filter_mem2
0.23% vq_nbest
0.18% open_loop_nbest_pitch
0.12% qmf_decomp
No preprocessor: (overall 2.12%)
0.43% split_cb_search_shape_sign
0.23% fir_mem_up
0.22% filter_mem2
0.21% vq_nbest
0.18% open_loop_nbest_pitch
0.11% qmf_decomp
Speex 1.0.7:
With preprocessor: (overall 2.33%)
0.28% speex_preprocess
0.25...

fir_mem16,iir_mem16 and filter_mem16 optimisations

2008 Aug 02

Hi! I have some questions about these functions: fir_mem16, iir_mem16 and filter_mem16. Filtering is very slow on the TI DSP, and I want to optimise it. Can somebody give me the formulas that describe how these filters work? Or any suggestions about how to transform the code for better performance? I am going to implement these functions in assembler, but it is hard to do without fully understanding how the functions

ARM4 filter code

2005 Dec 06

I have found that the fixed-point filter_mem2 does not match the inlined assembly version for arm4. Looking closer, there appears to be an off-by-one error. It occurs when setting the value of mem at the end of the inner loop. In the C fixed-point version this is done with a subtract. In the arm4 version instead of multiplyi...

gcc-4.1: svn 10958 fix point build fails

2006 Feb 25

...O3 -msse -MT filters.lo -MD -MP -MF .deps/filters.Tpo -c filters.c -fPIC -DPIC -o .libs/filters.o cc1: warning: command line option "-fvisibility-inlines-hidden" is valid for C++/ObjC++ but not for C In file included from filters.c:45: filters_sse.h:135: error: conflicting types for 'filter_mem2' filters.h:62: error: previous declaration of 'filter_mem2' was here filters_sse.h:234: error: conflicting types for 'iir_mem2' filters.h:64: error: previous declaration of 'iir_mem2' was here filters_sse.h:331: error: conflicting types for 'fir_mem2' filters.h:6...

speex on TI C5x fixed-point DSP

2004 Nov 03

> One thing I've noticed so far in the filter_mem2 code is the calls to > SATURATE(x, 805306368). 805306368 is 0x30000000. I was expecting that > to be on a bit boundary, say 0x3fffffff? In which case the arithmetic > saturation logic could be used. I don't think it would make that big of a difference, since the saturation is ou...
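The point about the limit not sitting on a bit boundary can be checked with a small clamp helper. This is a minimal sketch under stated assumptions: saturate32 and SAT_LIMIT are hypothetical names, and the helper assumes the saturation clamps symmetrically to the range [-a, a]; it is not taken from the Speex headers.

```c
#include <assert.h>
#include <stdint.h>

/* Limit quoted in the thread: 805306368 == 0x30000000 == 3 * 2^28.
 * It is not of the form 2^n - 1 (e.g. 0x3fffffff), so hardware
 * saturation to a bit boundary would not give the same result. */
#define SAT_LIMIT 805306368

/* Hypothetical helper: clamp x to [-a, a] (assumes a >= 0). */
int32_t saturate32(int32_t x, int32_t a)
{
    assert(a >= 0);
    if (x > a)
        return a;
    if (x < -a)
        return -a;
    return x;
}
```

With this helper, a value such as 0x3fffffff is clipped down to 0x30000000, whereas saturating on a 30-bit boundary would leave it unchanged.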

gcc-4.1: svn 10958 fix point build fails

2006 Mar 01

...MD -MP -MF .deps/filters.Tpo -c filters.c -fPIC -DPIC -o > .libs/filters.o > cc1: warning: command line option "-fvisibility-inlines-hidden" is > valid for C++/ObjC++ but not for C > In file included from filters.c:45: > filters_sse.h:135: error: conflicting types for 'filter_mem2' > filters.h:62: error: previous declaration of 'filter_mem2' was here > filters_sse.h:234: error: conflicting types for 'iir_mem2' > filters.h:64: error: previous declaration of 'iir_mem2' was here > filters_sse.h:331: error: conflicting types for 'fir_m...

how to study the speex source code

2006 May 25

...ay you tell me the detailed algorithm, or give more detailed notes on the source code of these two files, including the functions below:
void open_loop_nbest_pitch(float *sw, int start, int end, int len, int *pitch, float *gain, int N, char *stack);
float pitch_gain_search_3tap(
int forced_pitch_quant(
void filter_mem2(float *x, float *num, float *den, float *y, int N, int ord, float *mem)
void iir_mem2(float *x, float *den, float *y, int N, int ord, float *mem)
void fir_mem2(float *x, float *num, float *y, int N, int ord, float *mem)
void syn_percep_zero(float *xx, float *ak, float *awk1, float *awk2, float *y,...

speex on TI C5x fixed-point DSP

2004 Nov 01

Jean-Marc Valin wrote: >>I have the encoder and decoder running now and have verified that the >>encoder is bit-exact wrt to the fixed-point code running on x86 for the >>same 30-second audio sample. Encode and decode together run in >>real-time for 8KHz data, complexity=3, on 120MHz C5509 when code and >>data are all in on-chip SRAM. I have not tested the

TI 6xxx platform performance

2006 Jan 18

...er_prod() and normalize16() and I'm confident I can get 32 channels by optimizing 5 or 6 functions. I expect these numbers to translate over the DM642.
Symbol Name                  Count  cycle.Total:Incl.  cycle.Total:Excl.
compute_weighted_codebook      200            4511420            4511420
iir_mem2                       599            3338308            3338308
filter_mem2                    799            2323655            2323655
compute_impulse_response       200            1800518            1800518
pitch_gain_search_3tap         199            4726604            1744952
open_loop_nbest_pitch          199            4204121            1641016
vq_nbest                       800            1626252            1626252
lpc_to_lsp                      50            1612650            1558133
nb_encode                       50           27412845            1179551
fir_mem2                        50            1097300            1097300
i...

Coredumps when --enable-sse is selected

2004 Aug 06

...3.2, gcc-3.2.3 (weird palindrome there), on a Willamette core Pentium 4 (1.6GHz) system. I've tried both speex 1.1.5 release, and the current CVS (which self-IDs as 1.1.4), and the result is the same. I suspect some funk in the use of the SSE intrinsics macros. Backtrace: #0 0x40024594 in filter_mem2_10 (x=0x805f31c, _num=0x8061fb8, _den=0x8061fe4, y=0x806071c, N=160, ord=10, _mem=0x8062150) at xmmintrin.h:790 #1 0x400248b4 in filter_mem2 (x=0x805f31c, _num=0x8061fb8, _den=0x8061fe4, y=0x806071c, N=1, ord=0, _mem=0x8061fe4) at filters_sse.h:135 #2 0x40019d1e in nb_encode (state=0x80...

[PATCH] Make SSE Run Time option.

2004 Aug 06

...t is about 5% slower than the pure asm approach, so it's not too bad (SSE asm is 2x faster than x87). Note that unlike the previous version which had a kludge to work with order 8 (required for wideband), this version only works with order 10, so it will only work for narrowband.
void filter_mem2(float *x, float *_num, float *_den, float *y, int N, int ord, float *_mem)
{
   __m128 num[3], den[3], mem[3];
   int i;
   /* Copy numerator, denominator and memory to aligned xmm */
   for (i=0;i<2;i++)
   {
      mem[i] = _mm_loadu_ps(_mem+4*i);
      num[i] = _mm_loadu_ps(_num+4*i+1);...

Errors in speex lib with Blackfin

2006 Jan 18

Hello! I've downloaded speex lib 1.1.11.1. I am trying to port speex lib to the Blackfin processor. I am using VisualDSP++ 4.0. If I compile the source code using floating point, everything is OK. When I compile with FIXED_POINT defined, everything is OK and the code runs about two times faster. But when I define BFIN_ASM I get several compile errors in the Blackfin assembler

Coredumps when --enable-sse is selected

2004 Aug 06

...Willamette core Pentium 4 (1.6GHz) system. > > I've tried both speex 1.1.5 release, and the current CVS (which self-IDs as > 1.1.4), and the result is the same. > > I suspect some funk in the use of the SSE intrinsics macros. > > Backtrace: > > #0 0x40024594 in filter_mem2_10 (x=0x805f31c, _num=0x8061fb8, > _den=0x8061fe4, y=0x806071c, N=160, ord=10, > _mem=0x8062150) at xmmintrin.h:790 > #1 0x400248b4 in filter_mem2 (x=0x805f31c, _num=0x8061fb8, _den=0x8061fe4, > y=0x806071c, N=1, ord=0, > _mem=0x8061fe4) at filters_sse.h:135 > #2 0x400...

How can I optimize speex under SH

2004 Aug 23

Hi, I have just begun to look into Speex. Speex encodes and decodes fine on the SH platform (96MHz). But encoding takes a lot of time, i.e. for a 20ms frame it takes ~890ms to encode it, which is very high. [To encode 96KB of data it takes around 4.5min.] Is this architecture suitable for Speex? Will writing some assembly instructions and some optimization bring down the

speex on TI C5x fixed-point DSP

2004 Nov 03

...eeping the MSBs), which is what the >smulwb does on ARM. If that's the case, you can gain a lot of speed (use >one instruction for 16x32 instead of three). Otherwise, replacing the >32x32 multiplies by 16x16 is probably a good thing. > > One thing I've noticed so far in the filter_mem2 code is the calls to SATURATE(x, 805306368). 805306368 is 0x30000000. I was expecting that to be on a bit boundary, say 0x3fffffff? In which case the arithmetic saturation logic could be used. Jamey

speex on TI C5x fixed-point DSP

2004 Nov 04

Jean-Marc Valin wrote: >>One thing I've noticed so far in the filter_mem2 code is the calls to >>SATURATE(x, 805306368). 805306368 is 0x30000000. I was expecting that >>to be on a bit boundary, say 0x3fffffff? In which case the arithmetic >>saturation logic could be used. >> >> > >I don't think it would make that big of a...

Re: sigsegv in _mm_load_ups (linux/gcc 3.x)

2006 Jan 05

...g the real problem. Can you get a debugger and see exactly what assembly statement is causing the crash and what the operands are? Jean-Marc > So, has anyone else seen this issue? > > I am working off svn- the crash is always in the same spot, > in the decoder, in nb_celp, in both filter_mem2 > (if st->lpc_enh_enabled == 1) and iir_mem2 (if == 0) > > The function in question is filter_mem2_10 or > iir_mem2_10 > > _mm_loadu_ps is an unaligned load and all of the > data seems to be ok, and no sigill- get a sigsegv. > > Same code works fine on windoze. CPU...

Speex optimization and 12 bits conversion for 12 bits ADC

2007 Jul 25

...cally, if either of these apply to your CPU: 1) A store is more costly than a load (e.g. you have write-through cache) 2) You have a MAC instruction where it's complicated to load the accumulator every time then you should consider implementing filter_mem16 similarly to the commented version of filter_mem2() at the end of filters_bfin.h. >> > Another question, my ADC and DAC are 12 bits, but Speex codec is >> > 16bits, Did someone try to modify speex to 12 bits? I think if I >> > modify speex to 12 bits, the computation power will be reduced, is it >> > right? >...

SmartPhone ARM

2004 Aug 06

>What frequency is the ARM processor? The phone shows ARM720, no freq. I'm going to have to guess around 100 MHz. I ran the same code on an XSCALE ARM 400 MHz, a Toshiba e740. It runs at about .33 -> .4x realtime. This is using the generic fixed-point defines. Around 5 times faster than I am seeing with the Orange SPV e100. I am using the 1.1.3 codebase. Thanks for taking the time to
