thr3ads.net - Vorbis dev - [vorbis-dev] mdct_backward with fused muladd? [May 2003]

David Etherton

2003-May-20 15:12 UTC

[vorbis-dev] mdct_backward with fused muladd?

Can anybody point me at any resources that would explain how to optimize
mdct_backward for a cpu with a fused multiply-accumulate unit?
From what I understand from responses to my older postings, Tremor's
mdct_backward could be rewritten to take advantage of a muladd.

My target machine can do either two-wide 32x32 + Accum(64) -> Accum(64)
integer muladd or eight-wide 16x16 + Accum(32) -> Accum(32) integer muladd
or four-wide single-precision floating-point muladd.

The tremor code seems to be much cleaner and more portable than the stock
version for consoles (no double-precision math routines, compiles more or
less out-of-the-box on a C++ compiler) but I can afford an int-to-float if
necessary.

What values of 'n' does mdct_backward typically get called with?  Should it
be pretty simple to guarantee proper alignment of the input buffers to a
16-byte boundary?  Can I get away with 16x16 multiplies without too much
audio degradation?

I also would be better off without a big sincos lut as pointed out by Segher
Boessenkool back in March.

Thanks again.  Just to show that I'm not a total leech, here's a slightly
faster (at least on the PS2) version of bitrev12 that doesn't use any luts
(thanks to http://aggregate.org/MAGIC/)

STIN int bitrev12(int x){
  x = ((x & 0xaaa) >> 1) | ((x & 0x555) << 1);
  x = ((x & 0xccc) >> 2) | ((x & 0x333) << 2);
  x = ((x & 0xf00) >> 8) | (x & 0x0f0) | ((x & 0x00f) << 8);
  return x;
}
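
A quick way to gain confidence in a LUT-free bit reversal like the one above is to compare it against a bit-at-a-time loop over all 4096 possible 12-bit inputs. The sketch below does that. It assumes plain `static` in place of Tremor's STIN macro so that it compiles on its own, and `bitrev12_ref` and `bitrev12_mismatches` are hypothetical helper names, not part of Tremor.

```c
#include <assert.h>

/* The LUT-free version from the post, with `static` standing in for
   Tremor's STIN macro (an assumption made to keep this standalone). */
static int bitrev12(int x){
  x = ((x & 0xaaa) >> 1) | ((x & 0x555) << 1);  /* swap adjacent bits    */
  x = ((x & 0xccc) >> 2) | ((x & 0x333) << 2);  /* swap adjacent pairs   */
  x = ((x & 0xf00) >> 8) | (x & 0x0f0) | ((x & 0x00f) << 8);
                                                /* swap outer nibbles    */
  return x;
}

/* Hypothetical reference: reverse the low 12 bits one bit at a time. */
static int bitrev12_ref(int x){
  int r = 0;
  assert(x >= 0 && x < 4096);
  for (int i = 0; i < 12; i++)
    r |= ((x >> i) & 1) << (11 - i);
  return r;
}

/* Returns how many 12-bit inputs the two versions disagree on. */
static int bitrev12_mismatches(void){
  int bad = 0;
  for (int x = 0; x < 4096; x++)
    if (bitrev12(x) != bitrev12_ref(x)) bad++;
  return bad;
}
```

Calling `bitrev12_mismatches()` from a test program should return 0 if the two versions agree on every input.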

-Dave

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Michael Smith

2003-May-20 16:34 UTC

[vorbis-dev] mdct_backward with fused muladd?

On Wednesday 21 May 2003 08:12, David Etherton wrote:
> Can anybody point me at any resources that would explain how to optimize
> mdct_backward for a cpu with a fused multiply-accumulate unit?
MDCT optimisation is not my area of expertise, but I'll give some other advice
anyway...
>
> From what I understand from responses to my older postings, Tremor's
> mdct_backward could be rewritten to take advantage of a muladd.
>
> My target machine can do either two-wide 32x32 + Accum(64) -> Accum(64)
> integer muladd or eight-wide 16x16 + Accum(32) -> Accum(32) integer muladd
> or four-wide single-precision floating-point muladd.
>
> The tremor code seems to be much cleaner and more portable than the stock
> version for consoles (no double-precision math routines, compiles more or
> less out-of-the-box on a C++ compiler) but I can afford an int-to-float if
> necessary.
Well... it's _only_ a decoder, whereas the stock version includes the encoder.
This naturally makes it a lot simpler - most of the complexities are in the
encoder (for example, no double precision floats are needed for the decoder).
>
> What values of 'n' does mdct_backward typically get called with?  Should it
> be pretty simple to guarantee proper alignment of the input buffers to a
> 16-byte boundary?  Can I get away with 16x16 multiplies without too much
> audio degradation?
I think powers of 2 from 64 to 8192 are allowed, and the most common will
be 128 (or 256) and 2048 (or 4096 at very low bitrates). I'd have to check
those, though. Alignment should be simple enough to guarantee. 16x16
multiplies probably won't give acceptable audio quality.

> I also would be better off without a big sincos lut as pointed out by
> Segher Boessenkool back in March.
If this is because of the memory usage of the luts, you may be interested in
looking at the Tremor 'lowmem-branch' branch, out of cvs. It uses (I'm told,
I haven't tried it myself) about an order of magnitude less memory
(heap+stack). That's at a cost of marginally higher cpu usage (10-15%?), but
that might be a worthwhile tradeoff on a console.

Mike

Nicolas Pitre

2003-May-20 19:16 UTC

[vorbis-dev] mdct_backward with fused muladd?

On Tue, 20 May 2003, David Etherton wrote:
> Can anybody point me at any resources that would explain how to optimize
> mdct_backward for a cpu with a fused multiply-accumulate unit?
This was discussed on this list some time ago.
> From what I understand from responses to my older postings, Tremor's
> mdct_backward could be rewritten to take advantage of a muladd.
Well in fact you could start with the current Tremor code where the XPROD 
macros can easily be redefined specifically for your target.  You can look 
at the ARM version where the 32x32->64 MAC instruction is used for example.
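
To make the idea concrete, here is a minimal portable sketch of a cross-product step written so that each output is one multiply followed by one multiply-accumulate into a 64-bit accumulator. This is not Tremor's actual XPROD macro or its ARM assembly. The name `xprod32_mac` and the exact scaling (keeping the top 32 bits of the 64-bit sum) are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for an XPROD-style cross product:
     *x = (a*t + b*v) >> 32
     *y = (b*t - a*v) >> 32
   Each sum is accumulated at 64 bits and shifted once at the end, which
   maps onto a 32x32 + Accum(64) -> Accum(64) unit.
   Assumptions: inputs are scaled so the 64-bit sum cannot overflow, and
   >> on a negative int64_t is an arithmetic shift (true on common
   compilers, but implementation-defined in C). */
static void xprod32_mac(int32_t a, int32_t b, int32_t t, int32_t v,
                        int32_t *x, int32_t *y){
  int64_t acc;

  acc  = (int64_t)a * t;   /* multiply            */
  acc += (int64_t)b * v;   /* multiply-accumulate */
  *x = (int32_t)(acc >> 32);

  acc  = (int64_t)b * t;   /* multiply            */
  acc -= (int64_t)a * v;   /* multiply-subtract   */
  *y = (int32_t)(acc >> 32);
}
```

Because the shift happens after the accumulate, low-order bits from both products contribute to the result. With a = b = t = v = 0xC000, each product alone shifts down to 0, but their 64-bit sum shifts down to 1.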
> Can I get away with 16x16 multiplies without too much
> audio degradation?
No.  See the definition of MULT32() and MULT31() for the _LOW_ACCURACY_
case.  I had to make the multiplication unbalanced (like 24x8) since a 16x16
would not give acceptable audio quality.
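
One way to see why an unbalanced split can help is to compare a balanced 16x16 multiply with a 24x8 one on a small signal value. The sketch below is not the Tremor _LOW_ACCURACY_ code. The function names and shift amounts are assumptions chosen so that all three variants approximate the top 32 bits of a 32x32 product.

```c
#include <stdint.h>

/* Reference: full 32x32 -> 64 product, keep the top 32 bits. */
static int32_t mult32_full(int32_t x, int32_t y){
  return (int32_t)(((int64_t)x * y) >> 32);
}

/* Balanced split: top 16 bits of each operand, 32-bit product. */
static int32_t mult32_16x16(int32_t x, int32_t y){
  return (x >> 16) * (y >> 16);
}

/* Unbalanced split: top 24 bits of x (the signal) times the top 8 bits
   of y (the coefficient); the product still fits in 32 bits. */
static int32_t mult32_24x8(int32_t x, int32_t y){
  return (x >> 8) * (y >> 24);
}
```

For x = 0x1234 and y = 0x7FFFFFFF the full product yields 2329, the 24x8 split yields 2286, and the 16x16 split yields 0, because every significant bit of the small signal value is shifted out before the multiply.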
> Thanks again.  Just to show that I'm not a total leech, here's a slightly
> faster (at least on the PS2) version of bitrev12 that doesn't use any luts
> (thanks to http://aggregate.org/MAGIC/)
>
> STIN int bitrev12(int x){
>   x = ((x & 0xaaa) >> 1) | ((x & 0x555) << 1);
>   x = ((x & 0xccc) >> 2) | ((x & 0x333) << 2);
>   x = ((x & 0xf00) >> 8) | (x & 0x0f0) | ((x & 0x00f) << 8);
>   return x;
> }
Yeah, that's the generic cookbook method.  This however sucks big time on
ARM where immediate arguments can be used directly within opcodes only if
they're not wider than 8 bits.  The current version was optimized for ARM.

Nicolas
