Displaying 20 results from an estimated 28 matches for "mac16_16".
2014 Feb 08
3
[PATCH 1/2] arm: Use the UAL syntax for ldr<cc>h instructions
On Fri, 7 Feb 2014, Timothy B. Terriberry wrote:
> Martin Storsjo wrote:
>> This is required in order to build using the built-in assembler
>> in clang.
>
> These patches break the gcc build (with "Error: bad instruction").
Ah, right, sorry about that.
> Documentation I've seen is contradictory on which order ({cond}{size} or
> {size}{cond}) is correct.
2014 Feb 08
0
[PATCH v2] arm: Use the UAL syntax for instructions
...orr_arm.s b/celt/arm/celt_pitch_xcorr_arm.s
index 09917b1..598e45b 100644
--- a/celt/arm/celt_pitch_xcorr_arm.s
+++ b/celt/arm/celt_pitch_xcorr_arm.s
@@ -309,7 +309,7 @@ xcorr_kernel_edsp_process4_done
SUBS r2, r2, #1 ; j--
; Stall
SMLABB r6, r12, r10, r6 ; sum[0] = MAC16_16(sum[0],x,y_0)
- LDRGTH r14, [r4], #2 ; r14 = *x++
+ LDRHGT r14, [r4], #2 ; r14 = *x++
SMLABT r7, r12, r10, r7 ; sum[1] = MAC16_16(sum[1],x,y_1)
SMLABB r8, r12, r11, r8 ; sum[2] = MAC16_16(sum[2],x,y_2)
SMLABT r9, r12, r11, r9 ; sum[3] = MAC16...
2014 Feb 07
3
[PATCH 1/2] arm: Use the UAL syntax for ldr<cc>h instructions
...orr_arm.s b/celt/arm/celt_pitch_xcorr_arm.s
index 09917b1..3c4b950 100644
--- a/celt/arm/celt_pitch_xcorr_arm.s
+++ b/celt/arm/celt_pitch_xcorr_arm.s
@@ -309,7 +309,7 @@ xcorr_kernel_edsp_process4_done
SUBS r2, r2, #1 ; j--
; Stall
SMLABB r6, r12, r10, r6 ; sum[0] = MAC16_16(sum[0],x,y_0)
- LDRGTH r14, [r4], #2 ; r14 = *x++
+ LDRHGT r14, [r4], #2 ; r14 = *x++
SMLABT r7, r12, r10, r7 ; sum[1] = MAC16_16(sum[1],x,y_1)
SMLABB r8, r12, r11, r8 ; sum[2] = MAC16_16(sum[2],x,y_2)
SMLABT r9, r12, r11, r9 ; sum[3] = MAC16...
2006 May 01
2
Re: speex echo cancellation limitations
> I am writing to gain a better understanding of the limitations of speex echo
> cancellation, esp. with respect to the fixed point implementation.
> If these limitations have been documented elsewhere already, please let me
> know!
Nothing officially documented, sorry.
> I observe experimentally that when one or both of the echo or ref data for
> speex_echo_cancel() have
2013 May 21
2
[PATCH] 02-Add CELT filter optimizations
Please ignore my previous mail and patch, there is a new version :).
Patch changes are:
- Use MAC16_16 macros instead of (sum += a*b) and unroll a loop by 2. It
increases performance when using optimized macros (e.g. ARMv5E). A
possible side effect of the loop unrolling is that I don't check for odd
length here.
- Add NEON version of FIR filter and autocorr
- Add a section in autoconf in order to chec...
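As a rough illustration of the first bullet above, here is a minimal sketch of a dot product unrolled by 2 with MAC16_16. It is not the code from the patch: the function name dot_unrolled2 and the portable MAC16_16 definition are assumptions made for this example, and the odd-length tail is handled explicitly because the message says the patch does not check for it.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for MAC16_16 (assumption): c + a*b with a 32-bit result.
   An ARMv5E build would typically map this to a single SMLABB instruction. */
#define MAC16_16(c, a, b) ((c) + (int32_t)(a) * (int32_t)(b))

/* Hypothetical helper, not taken from the patch: dot product unrolled by 2. */
static int32_t dot_unrolled2(const int16_t *x, const int16_t *y, int n)
{
    int32_t sum = 0;
    int i;

    assert(n >= 0);

    /* Main loop: two multiply-accumulates per iteration. */
    for (i = 0; i + 1 < n; i += 2) {
        sum = MAC16_16(sum, x[i], y[i]);
        sum = MAC16_16(sum, x[i + 1], y[i + 1]);
    }

    /* Tail: when n is odd, one element is left over. */
    if (i < n)
        sum = MAC16_16(sum, x[i], y[i]);

    return sum;
}
```

The tail costs one extra branch per call; a caller that can guarantee an even length could drop it.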
2016 Sep 13
4
[PATCH 12/15] Replace call of celt_inner_prod_c() (step 1)
Should call celt_inner_prod().
---
celt/bands.c | 7 ++++---
celt/bands.h | 2 +-
celt/celt_encoder.c | 6 +++---
celt/pitch.c | 2 +-
src/opus_multistream_encoder.c | 2 +-
5 files changed, 10 insertions(+), 9 deletions(-)
diff --git a/celt/bands.c b/celt/bands.c
index bbe8a4c..1ab24aa 100644
--- a/celt/bands.c
+++ b/celt/bands.c
2013 May 21
0
[PATCH] 02-
- Use MAC16_16 macros instead of (sum += a*b) and unroll a loop by 2. It
increases performance when using optimized macros (e.g. ARMv5E). A
possible side effect of the loop unrolling is that I don't check for odd
length here.
- Add NEON version of FIR filter and autocorr
--
Aurélien Zanelli
Parrot SA
174, quai d...
2006 May 02
0
Re: speex echo cancellation limitations
...agnitude +/- 32767
-- 2nd arg is file containing all zeroes
The division by zero appears to be caused by the calculation:
See = inner_prod(st->e+st->frame_size, st->e+st->frame_size, st->frame_size)
which returns negative due to overflow occurring in mdf.c:inner_prod() :
part = MAC16_16(part,*x++,*y++);
part = MAC16_16(part,*x++,*y++);
part = MAC16_16(part,*x++,*y++);
part = MAC16_16(part,*x++,*y++);
sum = ADD32(sum,SHR32(part,6));
This overflow can be avoided by rewriting this as:
part = part + ((*x++ * *y++)>>1);
part = part + ((*x++ * *y...
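The arithmetic behind the overflow described above can be checked with a small sketch. It is not the mdf.c code: part_exact and fits_int32 are hypothetical helpers that accumulate four products in 64 bits so the true value can be compared against the 32-bit range, with an optional right shift applied to each product as in the proposed rewrite.

```c
#include <stdint.h>

/* Hypothetical helper: exact sum of four x*y products, each shifted right
   by `shift` bits before accumulation. Uses 64 bits so it cannot overflow. */
static int64_t part_exact(int16_t x, int16_t y, int shift)
{
    int64_t part = 0;
    int k;

    for (k = 0; k < 4; k++)
        part += ((int32_t)x * (int32_t)y) >> shift;
    return part;
}

/* Returns 1 if v is representable as a signed 32-bit integer, else 0. */
static int fits_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

With both inputs at 32767, the unshifted sum is 4 * 1073676289 = 4294705156, which does not fit in 32 bits, while the shifted sum is 4 * 536838144 = 2147352576, which just fits. With both inputs at -32768 the shifted sum is 2147483648, one more than INT32_MAX, so the shift alone only covers the +/- 32767 range.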
2016 Jun 17
5
ARM NEON optimization -- celt_fir()
Hi all,
This is Linfeng Zhang from Google. I'll work on ARM NEON optimization in the
next few months.
I'm submitting 2 patches in the following couple of emails, which contain the
newly created celt_fir_neon().
I revised celt_fir_c() to not pass in argument "mem" in Patch 1. If there are
concerns about this change, please let me know.
Many thanks for your comments.
Linfeng Zhang
2006 May 02
3
Re: speex echo cancellation limitations
..._prod(st->e+st->frame_size, st->e+st->frame_size, st->frame_size)
Does that also happen with "real life" signals or just high-amplitude
sinusoids? (Probably worth fixing anyway.)
> which returns negative due to overflow occurring in mdf.c:inner_prod() :
> part = MAC16_16(part,*x++,*y++);
> part = MAC16_16(part,*x++,*y++);
> part = MAC16_16(part,*x++,*y++);
> part = MAC16_16(part,*x++,*y++);
> sum = ADD32(sum,SHR32(part,6));
> This overflow can be avoided by rewriting this as:
> part = part + ((*x++ * *y++)>>1);
> ...
2007 Aug 29
2
high-pass filter issues
...num;
if (filtID>4)
filtID=4;
den = Pcoef[filtID]; num = Zcoef[filtID];
/*return;*/
for (i=0;i<len;i++)
{
spx_word16_t yi;
spx_word32_t vout = ADD32(MULT16_16(num[0], x[i]),mem[0]);
yi = EXTRACT16(SATURATE(PSHR32(vout,14),32767));
mem[0] = ADD32(MAC16_16(mem[1], num[1],x[i]),
SHL32(MULT16_32_Q15(-den[1],vout),1));
mem[1] = ADD32(MULT16_16(num[2],x[i]),
SHL32(MULT16_32_Q15(-den[2],vout),1));
y[i] = yi;
}
}
I can step into the function just fine, but when I run it, even just
from the initial variable declarations to the top of that...
2013 May 17
1
[Patch]01-Add ARM5E macros
...us_val16 a, opus_val32 b)
+{
+ int res;
+ __asm__(
+ "smlawb %0, %1, %2, %3;\n"
+ : "=&r"(res)
+ : "%r"(b<<1),"r"(a), "r"(c)
+ );
+ return res;
+}
+
+/** 16x16 multiply-add where the result fits in 32 bits */
+#undef MAC16_16
+static inline opus_val32 MAC16_16(opus_val32 c, opus_val16 a, opus_val16 b)
+{
+ __asm__(
+ "smlabb %0, %1, %2, %0;\n"
+ : "=&r"(c), "=r"(a), "=r"(b)
+ : "0"(c), "1"(a), "2"(b)
+ );
+ return c;
+}
+
+/*...
2005 Dec 09
1
[PATCH] compute_weighted_codebook a little bit faster
Hi,
here is a patch making the function compute_weighted_codebook a little
bit faster. Not so impressive, but it avoids a loop and really is faster on
small platforms like the MIPS I'm working on.
Enjoy,
Matthieu Poullet
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cwc_patch
Type: application/octet-stream
Size: 1226 bytes
Desc: not available
Url :
2004 Aug 06
1
fixed point macros
...ally and are defined to short (16 bits) and int (32 bits)
> for fixed-point. As for the macros, here are some of them (the rest
> should be easy to guess):
>
> ADD16, ADD32 adders for 16 and 32 bits
> MULT16_16 multiply a 16 bit value by another 16 bit value (result in
> 32)
> MAC16_16 same but also adds to the first argument
> MULT16_16_Q15 multiply a 16 bit value by another 16 bit value and shift
> right by 15 (result assumed to fit in 16 bits)
>
> Note that all these functions DO NOT perform saturation, so you need to
> make sure that the operations can't po...
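To make the macro descriptions above concrete, here is a minimal sketch of non-saturating definitions that match them. These are not copied from the Speex sources: the type names word16 and word32 and the exact casts are assumptions chosen for this example.

```c
#include <stdint.h>

/* Hypothetical type names: 16-bit and 32-bit fixed-point words. */
typedef int16_t word16;
typedef int32_t word32;

/* Adders for 16 and 32 bits (no saturation). */
#define ADD16(a, b) ((word16)((a) + (b)))
#define ADD32(a, b) ((word32)(a) + (word32)(b))

/* 16x16 multiply with a 32-bit result. */
#define MULT16_16(a, b) ((word32)(word16)(a) * (word32)(word16)(b))

/* Same as MULT16_16, but the product is added to the first argument. */
#define MAC16_16(c, a, b) (ADD32((c), MULT16_16((a), (b))))

/* 16x16 multiply shifted right by 15; result assumed to fit in 16 bits. */
#define MULT16_16_Q15(a, b) ((word16)(MULT16_16((a), (b)) >> 15))
```

For example, 16384 represents 0.5 in Q15, and MULT16_16_Q15(16384, 16384) gives 8192, which is 0.25 in Q15. None of these macros saturate, so the caller has to keep the operands within a range where the result fits.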
2009 Jan 14
0
[PATCH] Pitch now quantised at the band level, got rid of all the VQ code.
...- for (i=0;i<entries;i++)
- {
- celt_word32_t dist=0;
- const celt_pgain_t *inp = in;
- j=0; do {
- celt_pgain_t tmp1 = SUB16(*inp++,PGAIN_EVEN14(codebook, ind));
- celt_pgain_t tmp2 = SUB16(*inp++,PGAIN_ODD14(codebook, ind));
- ind++;
- dist = MAC16_16(dist, tmp1, tmp1);
- dist = MAC16_16(dist, tmp2, tmp2);
- } while (++j<len>>1);
- if (dist<min_dist)
- {
- min_dist=dist;
- best_index=i;
- }
- }
- return best_index;
-}
-
-int quant_pitch(celt_pgain_t *gains, int len)
-{
- int i, id;
-...
2004 Aug 06
0
[ANNOUNCE] PocketPC Port for speex-1.1.5 with sample code
...) (a)
#define SHL(a,shift) (a)
#define SATURATE(x,a) (x)
#define ADD16(a,b) ((a)+(b))
#define SUB16(a,b) ((a)-(b))
#define ADD32(a,b) ((a)+(b))
#define SUB32(a,b) ((a)-(b))
#define ADD64(a,b) ((a)+(b))
#define MULT16_16_16(a,b) ((a)*(b))
#define MULT16_16(a,b) ((a)*(b))
#define MAC16_16(c,a,b) ((c)+(a)*(b))
#define MULT16_32_Q11(a,b) ((a)*(b))
#define MULT16_32_Q13(a,b) ((a)*(b))
#define MULT16_32_Q14(a,b) ((a)*(b))
#define MULT16_32_Q15(a,b) ((a)*(b))
#define MAC16_32_Q11(c,a,b) ((c)+(a)*(b))
#define MAC16_32_Q15(c,a,b) ((c)+(a)*(b))
#define MAC16_1...
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
...));
+ xsum1 = _mm_add_ss(xsum1, _mm_shuffle_ps(xsum1, xsum1, 0x55));
+ _mm_store_ss(xy1, xsum1);
+ xsum2 = _mm_add_ps(xsum2, _mm_movehl_ps(xsum2, xsum2));
+ xsum2 = _mm_add_ss(xsum2, _mm_shuffle_ps(xsum2, xsum2, 0x55));
+ _mm_store_ss(xy2, xsum2);
+ for (;i<N;i++)
+ {
+ *xy1 = MAC16_16(*xy1, x[i], y01[i]);
+ *xy2 = MAC16_16(*xy2, x[i], y02[i]);
+ }
}
-#endif
-#if defined(OPUS_X86_MAY_HAVE_SSE2)
-opus_val32 celt_inner_prod_sse2(const opus_val16 *x, const opus_val16 *y,
+opus_val32 celt_inner_prod_sse(const opus_val16 *x, const opus_val16 *y,
int N)
{
- opus_in...
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
...));
+ xsum1 = _mm_add_ss(xsum1, _mm_shuffle_ps(xsum1, xsum1, 0x55));
+ _mm_store_ss(xy1, xsum1);
+ xsum2 = _mm_add_ps(xsum2, _mm_movehl_ps(xsum2, xsum2));
+ xsum2 = _mm_add_ss(xsum2, _mm_shuffle_ps(xsum2, xsum2, 0x55));
+ _mm_store_ss(xy2, xsum2);
+ for (;i<N;i++)
+ {
+ *xy1 = MAC16_16(*xy1, x[i], y01[i]);
+ *xy2 = MAC16_16(*xy2, x[i], y02[i]);
+ }
}
-#endif
-#if defined(OPUS_X86_MAY_HAVE_SSE2)
-opus_val32 celt_inner_prod_sse2(const opus_val16 *x, const opus_val16 *y,
+opus_val32 celt_inner_prod_sse(const opus_val16 *x, const opus_val16 *y,
int N)
{
- opus_in...
2004 Aug 06
2
[ANNOUNCE] PocketPC Port for speex-1.1.5 with sample code
Hi Jean-Marc,
Based on the wonderful Speex project, I've created SpeexOutLoud, essentially a Speex codec port for Windows Mobile 2003 devices.
I've included a sample project intended to show the usage of SpeexOutLoud codec in a Pocket PC application based on .NET Compact Framework.
Please go through the attached build and include it as a contribution to the
2005 Mar 02
7
Speex for TI 5509 DSP
I saw a thread in the list archives about a speex port to TI 55x DSP.
Wondering how that worked out (is working out)?
Also wondering if there is a source archive for it,
or if the patch in the email archives is still current, or if there have been
updates.
Any info appreciated.
Thanks
Paul