Displaying 2 results from an estimated 2 matches for "_dotp2".
2006 Feb 03
Speex inner_prod()
Hi,
Basically, inner_prod() can and should be adapted to the architecture it
will run on. It is not really sensitive to noise, so it's possible to
tweak it a lot. Also, in the current code, I saturate it to +-16384,
which is OK to prevent overflows. I'm not concerned with the case of a
constant -16384 value because it can't really happen in practice
(especially after filtering). BTW,
2006 Feb 04
Speex inner_prod(), normalize, C64 MIPS
...the summation of four
> mults overflowing the 32 bit before the shift.
>
> I can fix this by accumulating each term into a long, but if the code
> scales
> the x[],y[] vectors to avoid this problem I could use parallel 16x16
> multiply/adds.
What do you mean here?
The C64x has a _dotp2() instruction that does two 16x16 multiplies and adds
the products together. Since the values are scaled to 16384, I can add the
results of the two _dotp2()s together before the long add without worrying
about overflow. I didn't understand that inner_prod() was always passed
scaled vectors....
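
For anyone following the overflow argument, here is a rough sketch in portable C, not the actual Speex or TI code. It assumes the inputs are bounded by 16384 in magnitude, as described above, and that the length is a multiple of 4. The helpers dotp2_emul(), pack16() and inner_prod_sketch() are hypothetical names made up for illustration; dotp2_emul() only imitates what _dotp2() does (two signed 16x16 multiplies, products added). With |x|,|y| <= 2^14, each product is at most 2^28, so four products sum to at most 2^30, which still fits in a signed 32-bit int.

```c
#include <stdint.h>

/* Hypothetical portable stand-in for the C64x _dotp2() intrinsic:
   each 32-bit word holds two packed signed 16-bit values, and the
   result is lo*lo + hi*hi. */
static int32_t dotp2_emul(uint32_t a, uint32_t b)
{
    int16_t a_lo = (int16_t)(a & 0xFFFFu);
    int16_t a_hi = (int16_t)(a >> 16);
    int16_t b_lo = (int16_t)(b & 0xFFFFu);
    int16_t b_hi = (int16_t)(b >> 16);
    return (int32_t)a_lo * b_lo + (int32_t)a_hi * b_hi;
}

/* Hypothetical helper: pack two 16-bit samples into one 32-bit word. */
static uint32_t pack16(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

/* Sketch of an inner product over vectors assumed to satisfy
   |x[i]|, |y[i]| <= 16384 and len % 4 == 0. */
static int64_t inner_prod_sketch(const int16_t *x, const int16_t *y, int len)
{
    int64_t sum = 0; /* wide accumulator, standing in for the C64x long */
    for (int i = 0; i < len; i += 4) {
        /* Four products: at most 4 * 2^28 = 2^30, so no 32-bit overflow. */
        int32_t part =
            dotp2_emul(pack16(x[i], x[i + 1]), pack16(y[i], y[i + 1])) +
            dotp2_emul(pack16(x[i + 2], x[i + 3]), pack16(y[i + 2], y[i + 3]));
        sum += part; /* only this add needs the wide type */
    }
    return sum;
}
```

On a real C64x the packing step would normally be replaced by aligned 32-bit loads from the int16 arrays, so this only illustrates the arithmetic, not the performance.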