thr3ads.net - search: "_dotp2"

Displaying 2 results from an estimated 2 matches for "_dotp2".


2006 Feb 03

Speex inner_prod()

Hi, Basically, inner_prod() can and should be adapted to the architecture it will run on. It is not really sensitive to noise, so it's possible to tweak it a lot. Also, in the current code, I saturate it to +-16384, which is OK to prevent overflows. I'm not concerned with the case of a constant -16384 value because it can't really happen in practice (especially after filtering). BTW,

Speex inner_prod(), normalize, C64 MIPS

2006 Feb 04

Speex inner_prod(), normalize, C64 MIPS

> ...the summation of four
> mults overflowing the 32 bit before the shift.
>
> I can fix this by accumulating each term into a long, but if the code scales
> the x[],y[] vectors to avoid this problem I could use parallel 16x16
> multiply/adds.

What do you mean here? The C64x has a _dotp2() instruction that does two 16x16 multiplies and adds the products together. Since the values are scaled to 16384, I can add the results of the two _dotp2()s together before the long add without worrying about overflow. I didn't understand that inner_prod() was always passed scaled vectors....
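The overflow argument in this snippet can be sketched in portable C. This is only an illustration, not the Speex source or TI's intrinsic: `dotp2_emul()`, `inner_prod_sketch()` and the `SKETCH_SHIFT` value of 6 are hypothetical names and numbers chosen for the example, and the inputs are assumed to be bounded by ±16384 as the posts describe. With that bound each product is at most 2^28, so four products sum to at most 2^30, which still fits in a signed 32-bit value before the shift.

```c
#include <stdint.h>

/* Assumed per-group shift; a hypothetical value chosen for this sketch. */
#define SKETCH_SHIFT 6

/* Portable stand-in for one _dotp2() step (hypothetical helper):
 * two signed 16x16 multiplies whose products are added together.
 * On the C64x both 16-bit halves would sit in one 32-bit register. */
static int32_t dotp2_emul(const int16_t *x, const int16_t *y)
{
    return (int32_t)x[0] * y[0] + (int32_t)x[1] * y[1];
}

/* Inner product over groups of four samples.
 * Assumes |x[i]| <= 16384 and |y[i]| <= 16384 (2^14), so each product is
 * at most 2^28 and the four-term partial sum is at most 2^30 < 2^31.
 * Samples beyond the last full group of four are ignored in this sketch. */
static int32_t inner_prod_sketch(const int16_t *x, const int16_t *y, int len)
{
    int64_t sum = 0; /* wide accumulator, standing in for the "long add" */
    int i;

    for (i = 0; i + 3 < len; i += 4) {
        /* Two paired multiply-adds combined in 32 bits; safe only
         * because of the input bound stated above. */
        int32_t part = dotp2_emul(x + i, y + i)
                     + dotp2_emul(x + i + 2, y + i + 2);

        /* Right shift of a negative value is assumed to be arithmetic,
         * as it is on common compilers. */
        sum += part >> SKETCH_SHIFT;
    }
    return (int32_t)sum;
}
```

With full-range 16-bit inputs the same code would not be safe: (-32768) * (-32768) is 2^30, and four such products reach 2^32, which is the overflow the quoted text describes.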
