thr3ads.net - Speex dev - [Speex-dev] Speex on TI C6x, Problem with TI C5x Patch [May 2005]

If this information is useful, please help other people find it:
Share via:

Jim Crichton

2005-May-26 05:17 UTC

[Speex-dev] Speex on TI C6x, Problem with TI C5x Patch

Jean-Marc,
>> > It's odd that it "almost" works with the
fixed_generic.h. The easiest
>> > thing would be to gradually replace routines and see which one
causes
>> > problem. It's most likely (though I'm not 100% sure) that
somewhere in
>> > the code, I have a 16-bit value that gets sent to a function/macro
that
>> > expects a 32-bit value. In that case, Stuart's fixed_c55x.h
file would
>> > work because the inline function would first force the conversion
to 32
>> > bits. My initial guess would be with the SH[RL]32 and ADD32
functions.
Nice call.  The culprit is SHRL32.  This is not used in many places, and in 
most of those, the operand comes from EXTEND32.  The only really suspicious 
instances is in lsp.c (lsp_interpolate):

   spx_word16_t tmp = DIV32_16(SHL32(1 + subframe,14),nb_subframes); 
(subframe is an int)
> I just checked in changes that may fix some bad behavior on 16-bit
> architectures. You might want to try if it fixes the problem. As I said,
> it's not clear yet whether the problem is in fixed_generic.h or if
it's
> somewhere else and Stuart's fixed_c55.h just happens to work around it.
I see that your changes include adding an EXTEND32 to the line above.  I 
made ONLY this change to the 1.1.8 base that I am working from, and the 
problem is gone.

I will go back to using fixed_generic.h for now, but it may still be 
worthwhile to make a custom version that takes advantage of the compiler 
intrinsics, which include 32-bit shifts, 16x16=32 and 32x16=32 MPY, and 
32+16x16=32 MAC (with and without rounding).  The multiply arithmetic all 
returns saturated results.  It seems to me that can't hurt.  Do you see any 
problem with using saturated MPY and MACs?

By the way, the lsp.c change also fixes the TI C54x build, as expected.

- Jim Crichton

Jean-Marc Valin

2005-May-26 08:03 UTC

head link

[Speex-dev] Speex on TI C6x, Problem with TI C5x Patch

> Nice call.  The culprit is SHRL32.  This is not used in many places, and in
> most of those, the operand comes from EXTEND32.  The only really suspicious
> instances is in lsp.c (lsp_interpolate):
> 
>    spx_word16_t tmp = DIV32_16(SHL32(1 + subframe,14),nb_subframes); 
> (subframe is an int)
> ...
> I see that your changes include adding an EXTEND32 to the line above.  I 
> made ONLY this change to the 1.1.8 base that I am working from, and the 
> problem is gone.
Good. So I guess you are still having your low MIPS count, right? How
much is that?
> I will go back to using fixed_generic.h for now, but it may still be 
> worthwhile to make a custom version that takes advantage of the compiler 
> intrinsics, which include 32-bit shifts, 16x16=32 and 32x16=32 MPY, and 
> 32+16x16=32 MAC (with and without rounding).  The multiply arithmetic all 
> returns saturated results.  It seems to me that can't hurt.  Do you see
any
> problem with using saturated MPY and MACs?
Saturation never hurts. That's actually what I would have used if all
platforms supported it (which is at least not the case when ARMv4). If
you send me a file with only the macros you want to override from
fixed_generic.h, I can include it in the tree.
> By the way, the lsp.c change also fixes the TI C54x build, as expected.
What do you mean? Before that, it worked on C55, but not C54?

	Jean-Marc

-- 
Jean-Marc Valin <Jean-Marc.Valin@USherbrooke.ca>
Universit? de Sherbrooke

Jim Crichton

2005-May-26 11:50 UTC

head link

[Speex-dev] Speex on TI C6x, Problem with TI C5x Patch

>> Nice call.  The culprit is SHRL32.  This is not used in many places,
and
>> in
>> most of those, the operand comes from EXTEND32.  The only really 
>> suspicious
>> instances is in lsp.c (lsp_interpolate):
>>
>>    spx_word16_t tmp = DIV32_16(SHL32(1 + subframe,14),nb_subframes);
>> (subframe is an int)
>> ...
>> I see that your changes include adding an EXTEND32 to the line above. 
I
>> made ONLY this change to the 1.1.8 base that I am working from, and the
>> problem is gone.
>
> Good. So I guess you are still having your low MIPS count, right? How
> much is that?
About 42 MIPs peak for encode/decode, complexity 1, 8kbps, no preprocessor 
or jitter buffer or echo canceller (yet).
>> I will go back to using fixed_generic.h for now, but it may still be
>> worthwhile to make a custom version that takes advantage of the
compiler
>> intrinsics, which include 32-bit shifts, 16x16=32 and 32x16=32 MPY, and
>> 32+16x16=32 MAC (with and without rounding).  The multiply arithmetic
all
>> returns saturated results.  It seems to me that can't hurt.  Do you
see
>> any
>> problem with using saturated MPY and MACs?
>
> Saturation never hurts. That's actually what I would have used if all
> platforms supported it (which is at least not the case when ARMv4). If
> you send me a file with only the macros you want to override from
> fixed_generic.h, I can include it in the tree.
Good, but I will not get around to this for a while.  My next step is 
getting a multichannel version working on C64x.
>> By the way, the lsp.c change also fixes the TI C54x build, as expected.
>
> What do you mean? Before that, it worked on C55, but not C54?
No, they have always produced the same results.  The C54x seems to take 
about 5x as many cycles to do the task.  I can see that 32x32 multiplies are 
being done by library calls rather than inline, so I will play with this a 
bit more.  Others have used C54x successfully, so significant improvement is 
possible.

- Jim Crichton

Apparently Analagous Threads

Search for more seemingly similar threads

Speex dev - May 2005 - Speex on TI C6x, Problem with TI C5x Patch

[Speex-dev] Speex on TI C6x, Problem with TI C5x Patch

[Speex-dev] Speex on TI C6x, Problem with TI C5x Patch

[Speex-dev] Speex on TI C6x, Problem with TI C5x Patch

Apparently Analagous Threads