Jean-Marc Valin
2005-Aug-17 16:57 UTC
[Speex-dev] Updated MIPs and memory requirements for TI c54x or c55 DSPs
Hi,

Just a couple of tips to reduce complexity. First, I think you'd get a good speedup by enabling the PRECISION16 switch (if it's not done already). This (very) slightly reduces quality, but it means you convert a lot of "emulated" 16x32 multiplications into 16x16.

There are also several routines that would benefit from platform-specific optimizations. There are already optimizations for ARM (*_arm4.h), Blackfin (*_bfin.h) and SSE (*_sse.h), so you can see which functions are worth optimizing.

For a DSP, there are also two specific things I would watch. First, there is at least one CPU-intensive place (inner_prod) where I have to use many shifts in a loop to prevent overflows, but those could be replaced by a single shift at the end when using a 40-bit accumulator (which can also replace the only use I have of spx_word64_t).

The second thing is in the filters (*_mem2() functions). The reference implementation assumes that writes are as fast as reads. This is often not true on DSPs (at least on Blackfin), so it is possible to rewrite the algorithm to take that into account. There is an example (in C) for filter_mem2() and compute_impulse_response() at the end of filters_bfin.h.

Hope this helps.

Jean-Marc

On Wednesday, 17 August 2005 at 16:41 -0400, Jim Crichton wrote:
> The 42 MIP number was from my post of 24 May, and it was for C55x and
> Speex 1.1.8. I tested 1.1.10 yesterday, and the number went down very
> slightly; it was just over 41 MIPs peak measured in 20 ms blocks on
> MALE.WAV, for encoder/decoder loop, 8kbps narrowband with minimum
> complexity. C54x was just awful in comparison (>200 MIPs, not enough
> for real time), and I abandoned that family and switched to the
> C5509A. At least one other user has made some optimizations for C54x
> and gotten the MIPs down some, but the C55x family seems to like Speex
> a lot better (32-bit math support is more efficient). I had promised
> Jean-Marc to make a reference build for others to look at, but I have
> not got around to that yet. I can send you the .pjt and .cmd files
> that I used, if you like.

--
Jean-Marc Valin <Jean-Marc.Valin@USherbrooke.ca>
Université de Sherbrooke
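The inner_prod remark in the message above can be illustrated with a small sketch. This is not the Speex source: the function names, the 4-sample block size and the 6-bit shift are assumptions chosen for illustration, and int64_t merely stands in for a 40-bit DSP accumulator.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SHIFT 6  /* assumed scaling, for illustration only */

/* 32-bit accumulator: each 4-sample partial sum is shifted before it is
 * added, so the running total stays within 32 bits. Assumes the inputs are
 * small enough that four products fit in an int32_t, and that partial sums
 * are non-negative (right-shifting negative values is implementation-defined). */
int32_t inner_prod_shift_each(const int16_t *x, const int16_t *y, int len)
{
    int32_t sum = 0;
    assert(len % 4 == 0);
    for (int i = 0; i < len; i += 4) {
        int32_t part = 0;
        for (int k = 0; k < 4; k++)
            part += (int32_t)x[i + k] * y[i + k];
        sum += part >> SKETCH_SHIFT;   /* one shift per block */
    }
    return sum;
}

/* Wide accumulator (int64_t standing in for 40 bits): no shifts inside the
 * loop, a single shift at the end. */
int32_t inner_prod_shift_once(const int16_t *x, const int16_t *y, int len)
{
    int64_t acc = 0;
    for (int i = 0; i < len; i++)
        acc += (int32_t)x[i] * y[i];
    return (int32_t)(acc >> SKETCH_SHIFT);
}
```

With x = {1000, 2000, 3000, 4000, 10, 10, 10, 10} and y = {10, 20, 30, 40, 1, 1, 1, 1}, the first variant returns 4687 and the second 4688: the per-block shift discards the low bits of every partial sum, while the single shift truncates only once.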
Jim Crichton
2005-Aug-18 13:18 UTC
[Speex-dev] Patch, related to TI DSP C54x C55x C6x builds
Jean-Marc,

I have attached a small patch with modifications to arch.h, bits.c, and misc.c. This contains the few mods remaining to support the various fixed-point TI DSPs after the work that you did at the end of May (thank you for this).

arch.h: Add switch for compilers not supporting "long long" (C55x does, C54x and older C64x do not)
bits.c: Allow external definition for max buffer size, change MAX_BYTES_PER_FRAME to MAX_CHARS_PER_FRAME for consistency
misc.c: Added override switches to alloc routines and a conditional include of a user file; these changes allow manual memory allocation rather than using the heap

Note that none of these changes are necessary to get Speex to run on C55x.

The arch.h change allows operation with older versions of Code Composer Studio.
The bits.c change reduces the data memory usage, for applications running a subset of the Speex coding modes.
The misc.c change allows private memory allocation, for cases where it is not desirable to use the normal heap. This makes it possible to sort of comply with the TI algorithm interface standard (xDAIS).

I changed my code today to follow the pattern used for the Sharc port (OVERRIDE switches), but you might prefer a different style for including the user-provided allocation routines.

Regards,

Jim Crichton

-------------- next part --------------
A non-text attachment was scrubbed...
Name: speex1-1-10-to-c55x.patch
Type: application/octet-stream
Size: 2882 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20050818/7ea8ca40/speex1-1-10-to-c55x.obj
-------------- next part --------------
These are all of the changes and additions necessary to build a loopback application for the TI C5509A simulator. This build runs 8kbps narrowband, with minimum complexity.

Changed files (from 1.1.10 base):

arch.h: Add switch for compilers not supporting "long long" (C55x does, C54x and older C64x do not)
bits.c: Allow external definition for max buffer size, change MAX_BYTES_PER_FRAME to MAX_CHARS_PER_FRAME for consistency
misc.c: Added override switches to alloc routines and a conditional include of a user file; these changes allow manual memory allocation rather than using the heap

Note that none of these changes are necessary to get Speex to run on C55x.

The arch.h change allows operation with older versions of Code Composer Studio.
The bits.c change reduces the data memory usage.
The misc.c change allows private memory allocation, for cases where it is not desirable to use the normal heap.

Added files:

include\config.h (not automatically generated, sets memory sizes, enables manual alloc)
include\speex\speex_config_types.h (match Speex types to compiler types, not generated from types.in)
speex_c55_test\speex_c55_test.cmd (C5509A linker command file)
speex_c55_test\speex_c55_test.pjt (Code Composer Studio project file)
src\boot.asm (to force wait states to 0 for the simulator, otherwise cycle count is much too high)
src\testenc-54x.c (derived from testenc.c, manual alloc, byte packing/unpacking added)
src\user_misc.h (contains the manual memory alloc routines, with debug code)
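The override-switch idea described for misc.c can be sketched roughly as follows. This is only an illustration of the pattern, not the contents of the patch or of user_misc.h: the switch name OVERRIDE_SKETCH_ALLOC, the function sketch_alloc and the pool size are all hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical switch; a port would normally define it in config.h. */
#define OVERRIDE_SKETCH_ALLOC

#ifdef OVERRIDE_SKETCH_ALLOC
/* Stand-in for a user-provided file: memory comes from a fixed pool. */
#define SKETCH_POOL_WORDS 1024              /* hypothetical pool size */
static long sketch_pool[SKETCH_POOL_WORDS]; /* long-aligned storage */
static size_t sketch_pool_used = 0;         /* words handed out so far */

void *sketch_alloc(size_t size)
{
    size_t words;
    void *p;
    if (size > SKETCH_POOL_WORDS * sizeof(long))
        return NULL;                        /* can never fit */
    words = (size + sizeof(long) - 1) / sizeof(long);
    if (words > SKETCH_POOL_WORDS - sketch_pool_used)
        return NULL;                        /* pool exhausted */
    p = &sketch_pool[sketch_pool_used];
    sketch_pool_used += words;
    memset(p, 0, words * sizeof(long));     /* zeroed, like calloc */
    return p;
}
#else
/* Default path: ordinary zeroed heap allocation. */
void *sketch_alloc(size_t size)
{
    return calloc(size, 1);
}
#endif
```

Callers use sketch_alloc() the same way in both configurations, so the choice between heap and private pool stays a build-time decision. A pool like this never frees individual blocks, which suits state that is allocated once at start-up.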
Jean-Marc Valin
2005-Aug-19 08:39 UTC
[Speex-dev] Re: Patch, related to TI DSP C54x C55x C6x builds
Hi Jim,

Thanks for the patch. I'll apply it when I have a few minutes. If I haven't done so after a few weeks, please send it again. I'm in the process of relocating to Australia, so everything's a bit of a mess around here. Also, please post the c5X-specific files to the list (.cmd, .pjt, ...) so they'll be archived.

One last thing: I see you defined spx_word64_t as long long for the C5X. Isn't there a way to tell the compiler that the 40-bit accumulator is fine?

Jean-Marc

On Thursday, 18 August 2005 at 16:18 -0400, Jim Crichton wrote:
> Jean-Marc,
>
> I have attached a small patch with modifications to arch.h, bits.c, and
> misc.c. This contains the few mods remaining to support the various fixed
> point TI DSPs after the work that you did at the end of May (thank you for
> this).
>
> arch.h: Add switch for compilers not supporting "long long" (C55x does, C54x
> and older C64x do not)
> bits.c: Allow external definition for max buffer size, change
> MAX_BYTES_PER_FRAME to MAX_CHARS_PER_FRAME for consistency
> misc.c: Added override switches to alloc routines, conditional include of
> user file
> changes to allow manual memory allocation rather than using heap
>
> Note that none of these changes are necessary to get Speex to run on C55x.
>
> The arch.h change allows operation with older versions of Code Composer
> Studio.
> The bits.c change reduces the data memory usage, for applications running a
> subset of the Speex coding modes.
> The misc.c change allows private memory allocation, for cases where it is
> not desirable to use the normal heap.
> This makes it possible to sort of comply with the TI algorithm interface
> standard (xDAIS).
>
> I changed my code today to follow the pattern used for the Sharc port
> (OVERRIDE switches), but you might prefer a different style for including
> the user-provided allocation routines.
>
> Regards,
>
> Jim Crichton

--
Jean-Marc Valin <Jean-Marc.Valin@USherbrooke.ca>
Université de Sherbrooke