search for: smulwb

Displaying 17 results from an estimated 17 matches for "smulwb".

2011 Jun 17
1
speex on arm
...elp.c' || echo './'`nb_celp.c arm-linux-gcc -DHAVE_CONFIG_H -I. -I. -I.. -I../include -g -O2 -c nb_celp.c -MT nb_celp.lo -MD -MP -MF .deps/nb_celp.TPlo -fPIC -DPIC -o .libs/nb_celp.o /tmp/ccHf25Hb.s: Assembler messages: /tmp/ccHf25Hb.s:1573: Error: selected processor does not support `smulwb lr,r5,r3' /tmp/ccHf25Hb.s:2844: Error: selected processor does not support `smulbb r0,r1,r2' /tmp/ccHf25Hb.s:2871: Error: selected processor does not support `smulbb r2,r1,r3' /tmp/ccHf25Hb.s:2911: Error: selected processor does not support `smulbb r0,r1,r2' /tmp/ccHf25Hb.s:4386: Er...
2005 Mar 25
2
Port speex to my iPAQ 1945
Hi, I want to port speex to my pocket PC iPAQ1945, which has a Samsung 2410 processor, an ARM9-based processor. I would like to write specific optimized code for this chip. I have some experience with DSP chips and fixed-point coding but know nothing about embedded systems and ARM. Could someone give me some hints on how to write optimized code for this pocket PC? If you can give me some links that will
2004 Aug 06
2
Speex on Nokia 6600
Hi, I have one question: Will Speex run in realtime (both encode / decode, probably simultaneously) on Nokia 6600 --- basically ARM9 104MHz with Symbian 7s after porting to its C++ or Java? I am thinking mostly about the worst quality encoding (optionally duplex). Can this processor make it? Oh 6600 has something about 6mb memory if I remember correctly. Please cc kangur@polcom.net in replies.
2014 Dec 29
2
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize
Hi Timothy, It requires some extra effort if twiddles and input/output have different bit widths. Since Opus uses int32 for twiddles, we are going to do the same thing. Thanks, Phil Wang ...
2015 Jan 19
1
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize
...> > It requires some extra effort if twiddles and input/output have > > different bit width. Since Opus uses int32 for twiddles, we are going > > to do the same thing. > > Actually, the existing Opus code has 16-bit twiddles, mostly because it makes > it possible to use smulwb on ARMv5E. That being said, I agree that for Neon it > makes sense to use 32-bit twiddles since there's no 16x32 multiplier. > > Cheers, > > Jean-Marc > > > > > > > Thanks, > > > > Phil Wang > > > > > > -- IMPORTANT NOTICE: T...
2013 May 17
1
[Patch]01-Add ARM5E macros
...POSSIBILITY OF SUCH DAMAGE. +*/ + +#ifndef FIXED_ARM5E_H +#define FIXED_ARM5E_H + +/** 16x32 multiplication, followed by a 16-bit shift right. Results fits in 32 bits */ +#undef MULT16_32_Q16 +static inline opus_val32 MULT16_32_Q16(opus_val16 a, opus_val32 b) +{ + int res; + __asm__( + "smulwb %0, %1, %2;\n" + : "=&r"(res) + : "%r"(b),"r"(a) + ); + return res; +} + + +/** 16x32 multiplication, followed by a 15-bit shift right. Results fits in 32 bits */ +#undef MULT16_32_Q15 +static inline opus_val32 MULT16_32_Q15(opus_val16 a, opus_...
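For readers without an ARM toolchain, the arithmetic described by the two comments in the patch above can be written in portable C. This is only a reference sketch: the function names `mult16_32_q16_ref` and `mult16_32_q15_ref` are hypothetical and not part of the patch, and the code assumes that `>>` on a negative `int64_t` is an arithmetic shift (true for GCC and Clang, but implementation-defined in ISO C).

```c
#include <stdint.h>

/* Hypothetical portable reference for the Q16 macro in the patch:
 * a 16x32 multiply followed by a 16-bit right shift, i.e. the top
 * 32 bits of the 48-bit product. This is the value smulwb produces.
 * Assumes arithmetic right shift of negative values. */
static inline int32_t mult16_32_q16_ref(int16_t a, int32_t b)
{
    return (int32_t)(((int64_t)b * a) >> 16);
}

/* Same idea with a 15-bit shift, following the comment on the Q15
 * macro. The caller must ensure the result fits in 32 bits. */
static inline int32_t mult16_32_q15_ref(int16_t a, int32_t b)
{
    return (int32_t)(((int64_t)b * a) >> 15);
}
```

A reference like this is mainly useful for checking that an assembly version is bit-exact on a set of test vectors.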
2015 Dec 20
2
[Aarch64 v2 05/18] Add Neon intrinsics for Silk noise shape quantization.
...e significantly (and it's _probably_ fine), but I think you can actually do this faster while remaining bitexact. If you shift up the contents of coef32 by 15 bits (which you can do, since you are already transforming them specially for this platform), you can use vqdmulhq_s32() to emulate SMULWB. You then have to do the addition in a separate instruction, but because you can keep all of the results in 32-bit, you get double the parallelism and only need half as many multiplies (which have much higher latency than addition). Overall it should be faster, and match the C code exactly. >...
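The suggestion above can be checked with a scalar model. The sketch below is an assumption-laden illustration, not the code from the patch: `qdmulh_s32_model` is a hypothetical one-lane model of `vqdmulhq_s32()` (saturating doubling multiply, high half), `smulwb_ref` models SMULWB, and both assume arithmetic right shift of negative values.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane of vqdmulhq_s32():
 * the high 32 bits of the saturated doubled product 2*a*b. */
static int32_t qdmulh_s32_model(int32_t a, int32_t b)
{
    if (a == INT32_MIN && b == INT32_MIN)
        return INT32_MAX;                      /* the only saturating case */
    return (int32_t)(((int64_t)a * b) >> 31);  /* same as (2*a*b) >> 32 */
}

/* Reference for SMULWB: (32-bit x) * (16-bit c), shifted right by 16. */
static int32_t smulwb_ref(int32_t x, int16_t c)
{
    return (int32_t)(((int64_t)x * c) >> 16);
}

/* Emulation: pre-shift the 16-bit coefficient up by 15 bits, then use
 * the doubling high multiply. (x * c * 2^15) >> 31 equals (x * c) >> 16. */
static int32_t smulwb_via_qdmulh(int32_t x, int16_t c)
{
    int32_t c_shifted = (int32_t)c * 32768;    /* c << 15, avoiding a shift of a negative value */
    return qdmulh_s32_model(x, c_shifted);
}
```

Because the shifted coefficient never reaches `INT32_MIN` (its most negative value is -2^30), the saturating case cannot trigger, which is why the emulation can stay bit-exact.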
2004 Aug 06
0
Speex on Nokia 6600
...the worst > quality encoding (optionally duplex). Can this processor make it? Oh 6600 > has something about 6mb memory if I remember correctly. I'm not sure the current code will do it (maybe?), but I think it can be done. One question: does this ARM support instructions like smlabb, smulbb, smulwb, ... ? Jean-Marc -- Jean-Marc Valin, M.Sc.A., ing. jr. LABORIUS (http://www.gel.usherb.ca/laborius) Université de Sherbrooke, Québec, Canada ...
2004 Nov 03
0
speex on TI C5x fixed-point DSP
Jean-Marc Valin wrote: >Well, I guess the first thing to look is whether your DSP can actually >do either 16x32=>48 or 16x32=>32 (keeping the MSBs), which is what the >smulwb does on ARM. If that's the case, you can gain a lot of speed (use >one instruction for 16x32 instead of three). Otherwise, replacing the >32x32 multiplies by 16x16 is probably a good thing. > > One thing I've noticed so far in the filter_mem2 code is the calls to SATURATE(x,...
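To illustrate the trade-off described in the message above, here is a sketch of a 16x32 Q15 multiply built from two narrower multiplies, next to a single wide multiply. The function names are hypothetical and this is not claimed to be the code used on the C5x port. It assumes arithmetic right shift of negative values, and that the result fits in 32 bits; for the upper part to be a true 16-bit operand, `b >> 15` must also fit in 16 bits.

```c
#include <stdint.h>

/* Split form: b = hi * 2^15 + lo with 0 <= lo < 2^15, so
 * (a * b) >> 15 == a * hi + ((a * lo) >> 15) exactly.
 * Two multiplies plus shifts and an add. */
static int32_t mult16_32_q15_split(int16_t a, int32_t b)
{
    int32_t hi = b >> 15;       /* signed upper part (arithmetic shift assumed) */
    int32_t lo = b & 0x7fff;    /* non-negative 15-bit lower part */
    return a * hi + ((a * lo) >> 15);
}

/* Wide form: what a native 16x32 multiplier that keeps the MSBs
 * would give in one step. */
static int32_t mult16_32_q15_wide(int16_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 15);
}
```

The two forms agree whenever the result is representable, so the split version can serve as a fallback on hardware without a 16x32 multiplier.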
2005 Mar 27
0
Port speex to my iPAQ 1945
..., > > Assuming you have gcc, you can already compile with --enable-arm5e-asm > and get good results. Right now, many places use ARM4 assembly even on > ARM5E, so if you want even better results, you can rewrite those. The > main instructions you'll want to use are smulbb, smlabb, smulwb and > smlawb, which aren't present in ARM4 and are usually more efficient than > mul, smull and mla. > > Jean-Marc >
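As a rough guide to what the four instructions named above compute, here are C models of their arithmetic. The function names are hypothetical, the models ignore the Q (overflow) flag that the accumulating forms can set on hardware, and they assume two's-complement narrowing to `int16_t` and arithmetic right shift, as provided by GCC and Clang.

```c
#include <stdint.h>

/* smulbb: signed 16x16 -> 32 multiply of the bottom halfwords. */
static int32_t smulbb_model(int32_t rn, int32_t rm)
{
    return (int32_t)(int16_t)rn * (int16_t)rm;
}

/* smlabb: smulbb plus a 32-bit accumulator (wrapping add in this model). */
static int32_t smlabb_model(int32_t rn, int32_t rm, int32_t acc)
{
    return (int32_t)((uint32_t)smulbb_model(rn, rm) + (uint32_t)acc);
}

/* smulwb: 32-bit word times bottom halfword, keeping the top 32 bits
 * of the 48-bit product. */
static int32_t smulwb_model(int32_t rn, int32_t rm)
{
    return (int32_t)(((int64_t)rn * (int16_t)rm) >> 16);
}

/* smlawb: smulwb plus a 32-bit accumulator (wrapping add in this model). */
static int32_t smlawb_model(int32_t rn, int32_t rm, int32_t acc)
{
    return (int32_t)((uint32_t)smulwb_model(rn, rm) + (uint32_t)acc);
}
```

Each model corresponds to a single instruction on ARMv5E, whereas the same results built from mul, smull and mla generally need extra shifts or sign extensions.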
2014 Dec 29
0
[RFC][FFT][Fixed-Point][NEON] NEON-Optimize
...14 11:04 PM, Phil Wang wrote: > It requires some extra effort if twiddles and input/output have > different bit width. Since Opus uses int32 for twiddles, we are going > to do the same thing. Actually, the existing Opus code has 16-bit twiddles, mostly because it makes it possible to use smulwb on ARMv5E. That being said, I agree that for Neon it makes sense to use 32-bit twiddles since there's no 16x32 multiplier. Cheers, Jean-Marc > > > Thanks, > > Phil Wang > > > -- IMPORTANT NOTICE: The contents of this email and any attachments are > confident...
2015 Dec 21
0
[Aarch64 v2 05/18] Add Neon intrinsics for Silk noise shape quantization.
...'s _probably_ fine), but I think you can actually > do this faster while remaining bitexact. > > If you shift up the contents of coef32 by 15 bits (which you can do, > since you are already transforming them specially for this platform), > you can use vqdmulhq_s32() to emulate SMULWB. You then have to do the > addition in a separate instruction, but because you can keep all of the > results in 32-bit, you get double the parallelism and only need half as > many multiplies (which have much higher latency than addition). Overall > it should be faster, and match the...
2004 Nov 01
4
speex on TI C5x fixed-point DSP
Jean-Marc Valin wrote: >>I have the encoder and decoder running now and have verified that the >>encoder is bit-exact with respect to the fixed-point code running on x86 for the >>same 30-second audio sample. Encode and decode together run in >>real-time for 8 kHz data, complexity=3, on 120MHz C5509 when code and >>data are all in on-chip SRAM. I have not tested the
2004 Aug 06
3
Speex on Nokia 6600
...encoding (optionally duplex). Can this processor make it? Oh 6600 > > has something about 6mb memory if I remember correctly. > > I'm not sure the current code will do it (maybe?), but I think it can be > done. One question: does this ARM support instructions like smlabb, > smulbb, smulwb, ... ? > > Jean-Marc ...
2004 Aug 06
4
SmartPhone ARM
Hello Greg If money isn't a problem Intel has an optimized compiler for eVC and XScale processors http://www.intel.com/software/products/compilers/techtopics/PCA_Optimization_WP.pdf If you have any luck getting the eVC compiler closer to realtime I'd really like to know. I'm still far from realtime when using Speex 1.1.3 on a HP iPAQ (Intel pxa255). Best regards Bjoern D.
2015 Jul 19
4
Bug in ARM fixed-point ASM?
Hi, folks, I've been hunting down some strange bugs in audio I've been doing. While hunting my bugs down, I tripped across what appears to be an Opus bug, but it's not clear where it's coming from. Note that the optimization choices differ between the two in the config.log below. How can I force them to be the same? Presumably I need to force the android version toward the
2015 Nov 21
12
[Aarch64 v2 00/18] Patches to enable Aarch64 (version 2)
As promised, here's a re-send of all my Aarch64 patches, following comments by John Ridges. Note that they actually affect more than just Aarch64 -- other than the ones specifically guarded by AARCH64_NEON defines, the Neon intrinsics all also apply on armv7; and the OPUS_FAST_INT64 patches apply on any 64-bit machine. The patches should largely be independent and independently useful, other