thr3ads.net - search: "mac16_32_q15"

2004 Nov 03

2

speex on TI C5x fixed-point DSP

> One thing I've noticed so far in the filter_mem2 code is the calls to SATURATE(x, 805306368). 805306368 is 0x30000000. I was expecting that to be on a bit boundary, say 0x3fffffff? In which case the arithmetic saturation logic could be used.

I don't think it would make that big of a difference, since the saturation is outside of the inner loop. If it's that
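To make the SATURATE question above concrete, here is a minimal C sketch of the two ideas being compared: a symmetric clamp to an arbitrary limit such as 0x30000000, and a check for whether a value lies on a signed bit boundary. It assumes SATURATE(x, a) clamps to [-a, a]; the names `saturate_sym` and `fits_in_bits` are hypothetical and are not taken from the Speex sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from Speex): clamp x to [-limit, limit].
   Assumes limit > 0. */
static inline int32_t saturate_sym(int32_t x, int32_t limit)
{
    if (x > limit)  return limit;
    if (x < -limit) return -limit;
    return x;
}

/* Hypothetical helper: nonzero if x is representable as a signed
   two's-complement value of `bits` bits (1 <= bits <= 32), i.e. it lies
   inside the range that a bit-boundary saturation would enforce. */
static inline int fits_in_bits(int32_t x, int bits)
{
    int64_t lo = -((int64_t)1 << (bits - 1));
    int64_t hi = ((int64_t)1 << (bits - 1)) - 1;
    return x >= lo && x <= hi;
}

/* A few sanity checks on the values mentioned in the thread. */
static inline void saturate_examples(void)
{
    assert(0x30000000 == 805306368);              /* same constant */
    assert(saturate_sym(INT32_MAX, 805306368) == 805306368);
    assert(saturate_sym(-INT32_MAX, 805306368) == -805306368);
    assert(saturate_sym(1234, 805306368) == 1234); /* in range: unchanged */
    assert(fits_in_bits(0x3fffffff, 31));          /* largest 31-bit value */
    assert(!fits_in_bits(0x40000000, 31));         /* one past the boundary */
}
```

The limit 0x30000000 is not of the form 2^n - 1, so a clamp to it needs explicit comparisons like the ones in `saturate_sym`, whereas 0x3fffffff is the largest signed 31-bit value.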

speex on TI C5x fixed-point DSP

2004 Nov 04

0

> ...inner loop. If it's that critical, you could probably remove the saturation completely and just make sure your signals are scaled properly (i.e. not too close to saturation).

I just realized that. I added counters for each of the arithmetic macros and discovered that mac16_32_q15 is the most frequent. I'm not sure I can do much more without understanding the code better. Q15 is Q1.15 format, right? Looking at MAC16_32_Q15(long c, short a, long b): is b a Q15 value represented as Q17.15, so that the implementation does not depend on saturating hardware? If we do have s...
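For anyone else trying to pin down what a MAC16_32_Q15(c, a, b) operation computes, the following is a hedged C sketch of the arithmetic c + ((a * b) >> 15), written two ways: with a 64-bit intermediate, and split into two narrower products as a target without a wide multiplier might do it. The function names are hypothetical and this is not the Speex macro itself. It assumes that `>>` on a negative signed value is an arithmetic shift (implementation-defined in C, but true on mainstream compilers) and that b fits in 31 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: c + ((a * b) >> 15) with a 64-bit intermediate product.
   Assumes arithmetic right shift and that the result fits in 32 bits. */
static inline int32_t mac16_32_q15_wide(int32_t c, int16_t a, int32_t b)
{
    return c + (int32_t)(((int64_t)a * b) >> 15);
}

/* Same value without a 64-bit product. Write b = hi * 2^15 + lo with
   0 <= lo < 2^15; then (a * b) >> 15 == a * hi + ((a * lo) >> 15).
   Assumes b fits in 31 bits so that a * hi fits in 32 bits. */
static inline int32_t mac16_32_q15_split(int32_t c, int16_t a, int32_t b)
{
    int32_t hi = b >> 15;      /* upper part, keeps the sign of b */
    int32_t lo = b & 0x7fff;   /* low 15 bits, always non-negative */
    return c + a * hi + ((a * lo) >> 15);
}

/* a = 16384 is 0.5 in Q15, so the product term is b / 2. */
static inline void mac_examples(void)
{
    assert(mac16_32_q15_wide(10, 16384, 1000) == 510);
    assert(mac16_32_q15_split(10, 16384, 1000) == 510);
    assert(mac16_32_q15_wide(0, 16384, -1000) == -500);
    assert(mac16_32_q15_split(0, 16384, -1000) == -500);
    assert(mac16_32_q15_split(0, 16384, 100000) == 50000);
}
```

In this sketch a is treated as a Q15 coefficient, and b and c share whatever scaling the caller uses; the shift by 15 only removes the scaling contributed by a.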

[PATCH 4/8] Arm64 assembly for Celt fixed-point math.

2015 Aug 05

0

...: "%r"(b), "r"(a<<16)
+  );
+  return ((rd >> 32) << 1);
+}
+#define MULT16_32_Q15(a, b) (MULT16_32_Q15_arm64(a, b))
+
+
+/** 16x32 multiply, followed by a 15-bit shift right and 32-bit add.
+    b must fit in 31 bits.
+    Result fits in 32 bits. */
+#undef MAC16_32_Q15
+#define MAC16_32_Q15(c, a, b) ADD32(c, MULT16_32_Q15(a, b))
+
+/** 16x32 multiply, followed by a 16-bit shift right and 32-bit add.
+    Result fits in 32 bits. */
+#undef MAC16_32_Q16
+#define MAC16_32_Q16(c, a, b) ADD32(c, MULT16_32_Q16(a, b))
+
+/** 32x32 multiplication, followed by a 31-bit sh...

[Aarch64 06/11] Add aarch64 assembly for Celt fixed-point math.

2015 Nov 07

0

...: "%r"(b), "r"(a<<16)
+  );
+  return ((rd >> 32) << 1);
+}
+#define MULT16_32_Q15(a, b) (MULT16_32_Q15_arm64(a, b))
+
+
+/** 16x32 multiply, followed by a 15-bit shift right and 32-bit add.
+    b must fit in 31 bits.
+    Result fits in 32 bits. */
+#undef MAC16_32_Q15
+#define MAC16_32_Q15(c, a, b) ADD32(c, MULT16_32_Q15(a, b))
+
+/** 16x32 multiply, followed by a 16-bit shift right and 32-bit add.
+    Result fits in 32 bits. */
+#undef MAC16_32_Q16
+#define MAC16_32_Q16(c, a, b) ADD32(c, MULT16_32_Q16(a, b))
+
+/** 32x32 multiplication, followed by a 31-bit sh...

Blackfin inline assembly for fixed math

2010 Mar 25

0

..."A1 = %1.L*%2.L (M);\n\t"
"A1 = A1 >>> 15;\n\t"
"%0 = (A1 += %1.L*%2.H);\n\t"
: "=W" (res), "=d" (a), "=d" (b)
: "1" (a), "2" (b)
: "A1"
);
return res;
}

#undef MAC16_32_Q15
static inline celt_int32 MAC16_32_Q15(celt_int32 c, celt_int16 a, celt_int32 b)
{
celt_int32 res;
__asm__ (
"A1 = %2.L*%1.L (M);\n\t"
"A1 = A1 >>> 15;\n\t"
"%0 = (A1 += %2.L*%1.H);\n\t"
"%0 = %0 + %4;\n\t...

[Patch]01-Add ARM5E macros

2013 May 17

1

...16 a, opus_val32 b)
+{
+  int res;
+  __asm__(
+      "smulwb %0, %1, %2;\n"
+      : "=&r"(res)
+      : "%r"(b<<1),"r"(a)
+  );
+  return res;
+}
+
+
+/** 16x32 multiply-add, followed by a 15-bit shift right. Results fits in 32 bits */
+#undef MAC16_32_Q15
+static inline opus_val32 MAC16_32_Q15(opus_val32 c, opus_val16 a, opus_val32 b)
+{
+  int res;
+  __asm__(
+      "smlawb %0, %1, %2, %3;\n"
+      : "=&r"(res)
+      : "%r"(b<<1),"r"(a), "r"(c)
+  );
+  return res;
+}
+
+/** 16x16 m...
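The patch excerpt above passes b<<1 to SMLAWB, which is why a 16-bit shift ends up behaving like a 15-bit one. A small C model may help illustrate that: SMLAWB multiplies a 32-bit register by the signed low halfword of another, keeps the top 32 bits of the 48-bit product, and adds an accumulator. The function names below are hypothetical, the model assumes arithmetic right shift on negative values, and it assumes b fits in 31 bits so that doubling it does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* C model of ARM SMLAWB: ra + ((rn * low_halfword(rm)) >> 16).
   rm_lo stands for the signed bottom 16 bits of the rm register. */
static inline int32_t smlawb_model(int32_t rn, int16_t rm_lo, int32_t ra)
{
    return ra + (int32_t)(((int64_t)rn * rm_lo) >> 16);
}

/* Reference arithmetic: c + ((a * b) >> 15). */
static inline int32_t mac16_32_q15_ref(int32_t c, int16_t a, int32_t b)
{
    return c + (int32_t)(((int64_t)a * b) >> 15);
}

/* Mirrors the operand setup in the patch: feed 2*b as the 32-bit operand.
   b * 2 is used instead of b << 1 because left-shifting a negative value
   is undefined in ISO C. Assumes b fits in 31 bits. */
static inline int32_t mac16_32_q15_via_smlawb(int32_t c, int16_t a, int32_t b)
{
    return smlawb_model(b * 2, a, c);
}

static inline void smlawb_examples(void)
{
    assert(mac16_32_q15_via_smlawb(10, 16384, 1000) == 510);
    assert(mac16_32_q15_via_smlawb(-7, 321, -12345)
           == mac16_32_q15_ref(-7, 321, -12345));
}
```

Since (2 * b * a) >> 16 equals (a * b) >> 15 exactly under floor division, the two forms agree whenever 2 * b is representable, which would explain a 31-bit restriction on b.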

[ANNOUNCE] PocketPC Port for speex-1.1.5 with sample code

2004 Aug 06

0

...#define MULT16_16(a,b) ((a)*(b))
#define MAC16_16(c,a,b) ((c)+(a)*(b))
#define MULT16_32_Q11(a,b) ((a)*(b))
#define MULT16_32_Q13(a,b) ((a)*(b))
#define MULT16_32_Q14(a,b) ((a)*(b))
#define MULT16_32_Q15(a,b) ((a)*(b))
#define MAC16_32_Q11(c,a,b) ((c)+(a)*(b))
#define MAC16_32_Q15(c,a,b) ((c)+(a)*(b))
#define MAC16_16_Q11(c,a,b) ((c)+(a)*(b))
#define MULT16_16_Q11(a,b) ((a)*(b))
#define MULT16_16_Q13(a,b) ((a)*(b))
#define MULT16_16_Q14(a,b) ((a)*(b))
#define MULT16_16_Q15(a,b) ((a)*(b))
#define DIV32_16(a,b) ((a)/(b))
#define DIV32(a,b) ((a...

[ANNOUNCE] PocketPC Port for speex-1.1.5 with sample code

2004 Aug 06

2

Hi Jean-Marc, Based on the wonderful Speex project, I've created SpeexOutLoud, essentially a Speex codec port for Windows Mobile 2003 devices. I've included a sample project intended to show the usage of SpeexOutLoud codec in a Pocket PC application based on .NET Compact Framework. I'd request you to please go through the attached build, and include it as a contribution to the

[PATCH] Blackfin: cleanup astat/cc/hardware loop asm clobbers

2009 Apr 24

2

...px_word32_t b)
 "%0 = (A1 += %2.L*%1.H) ;\n\t"
 : "=&W" (res), "=&d" (b)
 : "d" (a), "1" (b)
- : "A1"
+ : "A1", "ASTAT"
 );
 return res;
 }
@@ -130,7 +133,7 @@ static inline spx_word32_t MAC16_32_Q15(spx_word32_t c, spx_word16_t a, spx_word
 "%0 = %0 + %4;\n\t"
 : "=&W" (res), "=&d" (b)
 : "d" (a), "1" (b), "d" (c)
- : "A1"
+ : "A1", "ASTAT"
 );
 return res;
 }
@@ -14...

[PATCH 0/8] Patches for arm64 (aarch64) support

2015 Aug 05

8

This sequence of patches provides arm64 support for Opus. Tested on iOS, Android, and Ubuntu 14.04. The patch sequence was written on top of Viswanath Puttagunta's Ne10 patches, but all but the second ("Reorganize pitch_arm.h") should, I think, apply independently of it. It does depend on my previous intrinsics configury reorganization, however. Comments welcome. With this and

[Aarch64 00/11] Patches to enable Aarch64 (arm64) optimizations, rebased to current master.

2015 Nov 07

12

Here are my aarch64 patches rebased to the current tip of Opus master. They're largely the same as my previous patch set, with the addition of the final one (the Neon fixed-point implementation of xcorr_kernel). This replaces Viswanath's Neon fixed-point celt_pitch_xcorr, since xcorr_kernel is used in celt_fir and celt_iir as well. These have been tested for correctness under qemu

Speex on TI C6x, Problem with TI C5x Patch

2005 May 25

3

...32_t)(b))
#define MAC16_16(c,a,b) ((c)+(spx_word32_t)(a)*(spx_word32_t)(b))
#define MULT16_32_Q11(a,b) ((a)*(b))
#define MULT16_32_Q13(a,b) ((a)*(b))
#define MULT16_32_Q14(a,b) ((a)*(b))
#define MULT16_32_Q15(a,b) ((a)*(b))
#define MAC16_32_Q11(c,a,b) ((c)+(a)*(b))
#define MAC16_32_Q15(c,a,b) ((c)+(a)*(b))
#define MAC16_16_Q11(c,a,b) ((c)+(a)*(b))
#define MAC16_16_Q13(c,a,b) ((c)+(a)*(b))
#define MULT16_16_Q11_32(a,b) ((a)*(b))
#define MULT16_16_Q13(a,b) ((a)*(b))
#define MULT16_16_Q14(a,b) ((a)*(b))
#define MULT16_16_Q15(a,b) ((a)*(b))
#define MULT16...