search for: fixed_generic

Displaying 20 results from an estimated 51 matches for "fixed_generic".

2015 Nov 16
0
[Fast Int64 2/4] Add OPUS_FAST_INT64 flavors of celt/fixed_generic.h macros.
--- celt/fixed_generic.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/celt/fixed_generic.h b/celt/fixed_generic.h index ac67d37..1cfd6d6 100644 --- a/celt/fixed_generic.h +++ b/celt/fixed_generic.h @@ -37,16 +37,32 @@ #define MULT16_16SU(a,b) ((opus_val32)(opus_val16)(a)*(opus_val32)(opus_uint16...
2015 Nov 13
2
[Aarch64 00/11] Patches to enable Aarch64
...bring this up again, and I don't want to beat a dead horse, but I was very surprised by your benchmarks so I took a little closer look. I think what's happening is that it's a little unfair to compare the ARM64 inline assembly to the C code, because looking at the C macros in "fixed_generic.h" for MULT16_32_Q16 and MULT16_32_Q15 you find they are implemented with two multiplies, two shifts, an AND and an ADD. It's not hard for me to believe that your inline assembly is faster than that mess. But on a 64-bit machine, there's no reason to go through all that when a simp...
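As a rough illustration of the point made in this snippet, the sketch below contrasts a split 16x32 multiply (two multiplies, two shifts, an AND and an ADD) with a single 64-bit multiply and shift. The function names are hypothetical and the code is not copied from celt/fixed_generic.h; it assumes that right-shifting a negative signed value is an arithmetic shift, which holds on common compilers but is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical sketch of a Q16 multiply done in 32-bit pieces:
   two multiplies, two shifts, an AND and an ADD. */
static int32_t mult16_32_q16_split(int16_t a, int32_t b)
{
    /* a times the signed high half of b */
    int32_t hi = (int32_t)a * (b >> 16);
    /* a times the unsigned low half of b, scaled back down by 16 bits */
    int32_t lo = ((int32_t)a * (int32_t)(b & 0xffff)) >> 16;
    return hi + lo;
}

/* The same result on a machine with a cheap 64-bit multiply:
   one widening multiply followed by one shift. */
static int32_t mult16_32_q16_wide(int16_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 16);
}
```

Under the arithmetic-shift assumption the two functions should agree for every input, so the wide form is a drop-in candidate wherever a 64-bit integer type is fast.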
2005 May 25
2
Speex on TI C6x, Problem with TI C5x Patch
...at the expense of a ~30% increase in MIPs. Now the male.wav file through >> encoder/decoder produces a bit-exact match with the C64x test that I did >> earlier. I will do some more testing to isolate the, but it may be a few >> days before I get to this task. As Jean-Marc says, fixed_generic should >> work, unless the compiler becomes hopelessly confused by something. >> Maybe >> this is a compiler bug. > > It's odd that it "almost" works with the fixed_generic.h. The easiest > thing would be to gradually replace routines and see which one caus...
2015 Nov 13
2
[Aarch64 00/11] Patches to enable Aarch64
...again, and I don't want to beat a dead horse, but I was very surprised by your benchmarks so I took a little closer look. >> >> I think what's happening is that it's a little unfair to compare the ARM64 inline assembly to the C code, because looking at the C macros in "fixed_generic.h" for MULT16_32_Q16 and MULT16_32_Q15 you find they are implemented with two multiplies, two shifts, an AND and an ADD. It's not hard for me to believe that your inline assembly is faster than that mess. But on a 64-bit machine, there's no reason to go through all that when a simple 6...
2005 May 26
2
Speex on TI C6x, Problem with TI C5x Patch
Jean-Marc, >> > It's odd that it "almost" works with the fixed_generic.h. The easiest >> > thing would be to gradually replace routines and see which one causes >> > problem. It's most likely (though I'm not 100% sure) that somewhere in >> > the code, I have a 16-bit value that gets sent to a function/macro that >> > expects...
2015 Nov 13
0
[Aarch64 00/11] Patches to enable Aarch64
...this up again, and I don't want to beat a dead horse, but I was very surprised by your benchmarks so I took a little closer look. > > I think what's happening is that it's a little unfair to compare the ARM64 inline assembly to the C code, because looking at the C macros in "fixed_generic.h" for MULT16_32_Q16 and MULT16_32_Q15 you find they are implemented with two multiplies, two shifts, an AND and an ADD. It's not hard for me to believe that your inline assembly is faster than that mess. But on a 64-bit machine, there's no reason to go through all that when a simple 6...
2015 Nov 16
0
[Aarch64 00/11] Patches to enable Aarch64
...to bring this up again, and I don't want to beat a dead horse, but I was very surprised by your benchmarks so I took a little closer look. I think what's happening is that it's a little unfair to compare the ARM64 inline assembly to the C code, because looking at the C macros in "fixed_generic.h" for MULT16_32_Q16 and MULT16_32_Q15 you find they are implemented with two multiplies, two shifts, an AND and an ADD. It's not hard for me to believe that your inline assembly is faster than that mess. But on a 64-bit machine, there's no reason to go through all that when a simple 6...
2005 May 25
2
Speex on TI C6x, Problem with TI C5x Patch
..."fixed_xx.h" header file. I don't know why, and >> haven't >> had time to investigate, but there is a definite improvement when I use >> the >> attached fixed_c55x.h file which has turned all the maths into inline >> functions. > > Did you try with fixed_generic.h or just with fixed_debug.h? > fixed_debug.h uses int and short directly, so I know it won't work with > the C5x. However, I think fixed_generic.h should work and has all the > operators defined as macros anyway, so inlining isn't a problem. I incorporated Stuart's fixed_c55x.h fi...
2007 Dec 12
1
4kbps sounds robotic on TMS320C64
Tried your fixed_generic.h change but that didn't help. Andy ----- Original Message ---- From: Andy Ngo <ndno72-speex@yahoo.com> To: Jean-Marc Valin <jean-marc.valin@usherbrooke.ca> Cc: speex-dev@xiph.org Sent: Wednesday, December 12, 2007 4:13:35 PM Subject: Re: [Speex-dev] 4kbps sounds robotic on TMS32...
2013 Jul 24
1
QCONST16 cross compile inconsistency
...define QCONST32(x,bits) ((opus_val32)(.5+(x)*(((opus_val32)1)<<(bits)))) - #define VERIFY_SHORT(x) ((x)<=32767&&(x)>=-32768) #define VERIFY_INT(x) ((x)<=2147483647LL&&(x)>=-2147483648LL) #define VERIFY_UINT(x) ((x)<=(2147483647LLU<<1)) diff --git a/celt/fixed_generic.h b/celt/fixed_generic.h index ac01a43..0b17563 100644 --- a/celt/fixed_generic.h +++ b/celt/fixed_generic.h @@ -48,12 +48,6 @@ /** 32x32 multiplication, followed by a 31-bit shift right. Results fits in 32 bits */ #define MULT32_32_Q31(a,b) ADD32(ADD32(SHL(MULT16_16(SHR((a),16),SHR((b),16)),1),...
2015 Nov 12
2
[Aarch64 00/11] Patches to enable Aarch64
One other minor thing: I notice that in the inline assembly the result (rd) is constrained as an earlyclobber operand. What was the reason for that?
2015 Nov 16
3
[Fast Int64 1/4] Move OPUS_FAST_INT64 definition to celt/arch.h.
--- celt/arch.h | 5 +++++ silk/macros.h | 4 +--- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/celt/arch.h b/celt/arch.h index 9f74ddd..670527b 100644 --- a/celt/arch.h +++ b/celt/arch.h @@ -78,6 +78,11 @@ static OPUS_INLINE void _celt_fatal(const char *str, const char *file, int line) #define UADD32(a,b) ((a)+(b)) #define USUB32(a,b) ((a)-(b)) +/* Set this if opus_int64
2011 Sep 01
0
[PATCH 3/5] resample: Add NEON optimized inner_product_single for fixed point
...oint macro SATURATE32PSHR(x, shift, a). It does pretty much the same thing as SATURATE32(PSHR32(x, shift), a), but it avoids overflowing in the rounding-up phase on the rare occasion where x has already been saturated. It should also be slightly faster. --- libspeex/arch.h | 1 + libspeex/fixed_generic.h | 4 ++ libspeex/resample.c | 10 ++++- libspeex/resample_neon.h | 100 ++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 113 insertions(+), 2 deletions(-) create mode 100644 libspeex/resample_neon.h diff --git a/libspeex/arch.h b/libspeex/arch.h index 3b47ed9..daa72a7...
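The idea described in this patch summary can be sketched as follows. This is only an illustration under stated assumptions, not the macro from the patch: the function names are hypothetical, `a` is assumed positive, `(a << shift)` plus half a rounding step is assumed to fit in 32 bits, `shift` is assumed to lie in 1..30, and right shifts of negative values are assumed to be arithmetic. The point is that comparing `x` against the shifted limit first means the rounding add is never applied to a value that is already near the 32-bit limit.

```c
#include <stdint.h>

/* Hypothetical sketch: round-shift x right by `shift` and clamp to [-a, a].
   The limit is checked BEFORE the rounding constant is added, so an x that
   is already saturated cannot overflow during rounding. */
static int32_t saturate32_pshr(int32_t x, int shift, int32_t a)
{
    int32_t limit = a << shift;           /* assumed to fit in 32 bits */
    if (x >= limit)
        return a;
    if (x <= -limit)
        return -a;
    /* in range: add half a step, then shift (rounding to nearest) */
    return (x + ((1 << shift) >> 1)) >> shift;
}

/* Reference for comparison: the same arithmetic carried out in 64 bits,
   where the rounding add cannot overflow, followed by a clamp. */
static int32_t saturate32_pshr_ref(int32_t x, int shift, int32_t a)
{
    int64_t r = ((int64_t)x + (((int64_t)1 << shift) >> 1)) >> shift;
    if (r > a)
        return a;
    if (r < -a)
        return -a;
    return (int32_t)r;
}
```

For example, with `shift = 15` and `a = 32767`, an input of `INT32_MAX` takes the early-return path and yields 32767 without ever computing `x + 16384` in 32 bits.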
2005 May 25
0
Speex on TI C6x, Problem with TI C5x Patch
...s, > at the expense of a ~30% increase in MIPs. Now the male.wav file through > encoder/decoder produces a bit-exact match with the C64x test that I did > earlier. I will do some more testing to isolate the, but it may be a few > days before I get to this task. As Jean-Marc says, fixed_generic should > work, unless the compiler becomes hopelessly confused by something. Maybe > this is a compiler bug. It's odd that it "almost" works with the fixed_generic.h. The easiest thing would be to gradually replace routines and see which one causes problem. It's most lik...
2019 May 27
0
opus-1.3.1 patch for ARM Cortex-M4F (single precision)
...> QCONST16(.4f,15)) && (!st->analysis.valid || st->analysis.tonality > .3f) && (pitch_index > 1.26*st->prefilter_period || pitch_index < .79*st->prefilter_period)) pitch_change = 1; if (pf_on==0) diff -Naupr opus-1.3.1-vanilla/celt/fixed_generic.h opus-1.3.1/celt/fixed_generic.h --- opus-1.3.1-vanilla/celt/fixed_generic.h 2018-09-26 14:49:41 +0800 +++ opus-1.3.1/celt/fixed_generic.h 2019-05-27 17:16:07 +0800 @@ -65,10 +65,10 @@ #endif /** Compile-time conversion of float constant to 16-bit value */ -#define QCONST16(x,bits) ((opus_val1...
2015 Nov 21
8
[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.
--- configure.ac | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/configure.ac b/configure.ac index f52d2c2..e1a6e9b 100644 --- a/configure.ac +++ b/configure.ac @@ -190,7 +190,7 @@ AC_ARG_ENABLE([rtcd], [enable_rtcd=yes]) AC_ARG_ENABLE([intrinsics], - [AS_HELP_STRING([--disable-intrinsics], [Disable intrinsics optimizations for ARM(float) X86(fixed)])],, +
2011 Sep 01
6
[PATCH 0/5] ARM NEON optimization for samplerate converter
...-resample-full-sinc-table conf flag resample: Add NEON optimized inner_product_single for fixed point configure.ac: Add ARM NEON support resample: Add NEON optimized inner_product_single for floating point configure.ac | 35 ++++++++ libspeex/arch.h | 1 + libspeex/fixed_generic.h | 4 + libspeex/resample.c | 14 +++- libspeex/resample_neon.h | 201 ++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 253 insertions(+), 2 deletions(-) create mode 100644 libspeex/resample_neon.h -- 1.7.4.1
2006 Apr 21
2
Major internal changes, TI DSP build change
...5 and 0x4006 on the > C6x. When I patch the value 0x4006 into the C55 build, the output matches > the C6x. The problem is that 2^15 evaluates to -32768 on the C55 and 32768 > on the C6x. Right on! > Applying our friend EXTEND32 causes the constant to evaluate correctly. In > fixed_generic.h, > #define QCONST16(x,bits) > ((spx_word16_t)((x)*((EXTEND32(1))<<(bits))+((EXTEND32(1))<<((bits)-1)))) Actually, this is a case for a simple cast to (spx_word32_t) because QCONST can be used in a static initialization and EXTEND32 *can* be defined as a function (e.g. for fixe...
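The fix discussed in this thread (a plain cast rather than EXTEND32, so the macro stays usable in static initializers) might look roughly like the sketch below. It is modelled on the QCONST32 form quoted in the 2013 result above and is not claimed to be the exact libspeex definition; the typedefs are stand-ins introduced here for illustration. The cast matters on a target where `int` is 16 bits, because there `1 << 15` does not fit in a signed `int`, which matches the -32768 observed on the C55.

```c
#include <stdint.h>

/* Stand-in typedefs for this sketch; the real headers define their own. */
typedef int16_t spx_word16_t;
typedef int32_t spx_word32_t;

/* Convert a floating-point constant to a 16-bit fixed-point value with
   `bits` fractional bits. The 1 is widened to 32 bits with a plain cast
   BEFORE shifting, so the shift is done in 32-bit arithmetic even when
   int is only 16 bits wide. Adding .5 rounds non-negative inputs. */
#define QCONST16(x, bits) \
    ((spx_word16_t)(.5 + (x) * (((spx_word32_t)1) << (bits))))

/* Because the macro expands to a constant expression built from casts
   (no function call), it can initialize a static object. */
static const spx_word16_t half_q15 = QCONST16(0.5, 15);
```

Here `half_q15` holds 16384, which is 0.5 in Q15. If the widening were done by a function instead of a cast, this static initialization would not compile.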
2005 May 26
0
Speex on TI C6x, Problem with TI C5x Patch
...> ... > I see that your changes include adding an EXTEND32 to the line above. I > made ONLY this change to the 1.1.8 base that I am working from, and the > problem is gone. Good. So I guess you are still having your low MIPS count, right? How much is that? > I will go back to using fixed_generic.h for now, but it may still be > worthwhile to make a custom version that takes advantage of the compiler > intrinsics, which include 32-bit shifts, 16x16=32 and 32x16=32 MPY, and > 32+16x16=32 MAC (with and without rounding). The multiply arithmetic all > returns saturated results...
2007 Dec 12
0
4kbps sounds robotic on TMS320C64
...fine. When I took the same sample and encoded it in 4kbps on the DSP side, then decoded it on the ARM side, playback sounds robotic. I wonder if I'm missing some macros (#define 's) when compiling for the DSP side, but this can't be the case since 6kbps sounds fine. I'll try your fixed_generic.h suggestion and let you know. Thanks, Andy ----- Original Message ---- From: Jean-Marc Valin <jean-marc.valin@usherbrooke.ca> To: Andy Ngo <ndno72-speex@yahoo.com> Cc: speex-dev@xiph.org Sent: Wednesday, December 12, 2007 3:52:33 PM Subject: Re: [Speex-dev] 4kbps sounds robotic on...