thr3ads.net - Vorbis dev - [vorbis-dev] integer pcm decode patch [Apr 2000]

robert@moon.eorbit.net

2000-Apr-19 13:41 UTC

[vorbis-dev] integer pcm decode patch

Hi!

I've spent the last few nights digging into the Vorbis source and working
to implement a vorbis_synthesis_pcmout_int() function that kicks out
interleaved int16_t pcm data.

I think it's important to have this function available to make the
job for people using the codec a little easier. This function abstracts
out the conversion to int16_t and removes the extra overhead of
moving the pcm data over the processor's data bus just to do the int16_t
conversion after the Vorbis synthesis.

I've written that function and I've created a Linux command line player
that reads a .ogg from stdin and writes the pcm samples to the OSS
sound drivers using the vorbis_synthesis_pcmout_int() function.
The Linux command line player (that's giving it too much credit, really)
is, just like the encode/decode example, limited to 44 kHz files.

The patch for these features can be found here:

   http://moon.eorbit.net/~robert/pcm16.patch

There is one slight problem with the vorbis_synthesis_pcmout_int()
function that I hope we can all solve as a team. I personally failed
to solve this problem, but I think it important to get this function into
the codebase before the first release. I trust once the format and the
interface settle there will be more focus on optimization and at that
time we should be able to solve the problem, if not before.

The problem occurs in block.c line 666 (!) and line 669 of the
patched sources. These two lines are the lines that I chose for
doing the double to int16_t conversion. The problem is that for
my test case this solution was actually a bit slower than the
normal vorbis_synthesis_pcmout() call, when the original goal of
this function was to make the decoder more efficient.

The problem is that with the synthesizer spitting out int16_t data it was
actually doing slightly more than twice as many float multiplies as
the code that does the conversion as the very last step. This is due to the
nature of the synthesizer, since it may make multiple passes over the pcm
data as it overlaps/adds two windows together.
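
For comparison, converting once as the very last step might look roughly
like the sketch below. The helper name interleave_to_int16 and its
parameters are hypothetical and not part of libvorbis; the sketch assumes
planar double samples nominally in [-1.0, 1.0], as in the patch. Each
output sample costs exactly one multiply, however many overlap/add passes
came before.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libvorbis: convert planar double
   samples (nominally in [-1.0, 1.0]) to interleaved int16_t with one
   multiply per sample, after all overlap/add work is finished. */
static void interleave_to_int16(double *const *pcm, int channels,
                                size_t samples, int16_t *out)
{
    for (int ch = 0; ch < channels; ch++) {
        const double *src = pcm[ch];
        int16_t *dst = out + ch;
        for (size_t i = 0; i < samples; i++, dst += channels) {
            double v = src[i] * 32767.0;
            /* Clip: decoded audio can overshoot full scale slightly. */
            if (v > 32767.0) v = 32767.0;
            if (v < -32768.0) v = -32768.0;
            /* Round to nearest (ties away from zero), not truncate. */
            *dst = (int16_t)(v < 0.0 ? v - 0.5 : v + 0.5);
        }
    }
}
```

A caller would pass the per-channel buffers it got from the decoder, the
channel count, the number of samples per channel, and an output buffer of
channels * samples int16_t values.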

Finding a better place to do the int16_t conversion is the key to solving this
problem.  One solution that I looked into was having the mdct kick out
int16_t values, but without having access to and understanding the original
paper that the mdct code was based on I only made things worse.

I should really go and pay some attention to my girlfriend now...

--ruaok         Freezerburn! All else is only icing. -- Soul Coughing

Robert Kaye -- robert@moon.eorbit.net  http://moon.eorbit.net/~robert

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/

Jonathan Blow

2000-Apr-19 18:36 UTC

[vorbis-dev] integer pcm decode patch

robert@moon.eorbit.net wrote:
> Finding a better place to do the int16_t conversion is the key to solving this
> problem.  One solution that I looked into was having the mdct kick out
> int16_t values, but without having access to and understanding the original
> paper that the mdct code was based on I only made things worse.
Hi, I am using WinCVS and don't want to get cvs working on my linux machines
right now, and don't have patch on my win32 machine.  So I am reading
straight from the patch file.  I assume this is the code you are talking
about:

+         for(i=beginSl,t=beginSl;i<endSl;i++,t+=vi->channels)
+           pcm_int[t+j]+=vb->pcm[j][i]*32767.;
+         /* the remaining section */
+         for(;i<sizeW;i++,t+=vi->channels)
+           pcm_int[t+j]=vb->pcm[j][i]*32767.;

Here is my question: would it help to have a faster way to convert
from double to integer?

Being a game developer I possess some arcane knowledge about how to
do wacky things with IEEE-754 floating point numbers.  A standard trick
that we do in games is to manipulate this format directly to squeeze out
the values we want.  This does two things: it (a) eliminates floating
point multiplies, which are more expensive than adds on some processors,
and (b) it eliminates the _ftol or equivalent code that the compiler
sticks in when you go from any float size to integer (This stuff is there
to enforce the IEEE rounding semantics which, most of the time you are
converting float to int, you don't really care about... and it is this
stuff that usually takes all the time).

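For example here is some code that converts from a double to a 32-bit
integer (without scaling the double)... uhh this is actually C++,
but it can be done in plain C just as well:

----------------------------------------
#include <cstdint>
#include <cstring>

// 1.5 * 2^36: after the add, the low 32 mantissa bits hold a 16.16 fixed-point value.
static const double fix32_conv_factor = ((double)0x10000000) * 256.0 * 1.5;
// 1.5 * 2^52: after the add, the low 32 mantissa bits hold the rounded integer.
static const double int_conv_factor = (fix32_conv_factor * (double)0x10000);

// Index of the low 32-bit word of a double (0 on little-endian machines).
const int LOW_WORD_OFFSET = 0;

inline int32_t iDOUBLE_TO_INT(double d) {
    d += int_conv_factor;
    int32_t words[2];
    std::memcpy(words, &d, sizeof d);  // copy the bits instead of casting pointers
    return words[LOW_WORD_OFFSET];
}
----------------------------------------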

The value of LOW_WORD_OFFSET changes depending on the architecture (on a 
sparc, you want it to be 1).

Anyway to scale the number, you add a different value to the double
in the first line of iDOUBLE_TO_INT.  So for example, if you want to
multiply the double by 65536 (a 16.16 fixed-point result), just use
'fix32_conv_factor' instead of 'int_conv_factor':

----------------------------------------
#include <cstdint>
#include <cstring>

inline int32_t iDOUBLE_TO_INT(double d) {
    d += fix32_conv_factor;            // scales by 2^16 while converting
    int32_t words[2];
    std::memcpy(words, &d, sizeof d);  // copy the bits instead of casting pointers
    return words[LOW_WORD_OFFSET];
}
----------------------------------------

Assuming that your compiler (gcc?) is generating rounding code for your
cast, which it probably is, this is going to be hella faster.

If you want to know why this actually works, I have a book in progress
that talks about this stuff; I can forward you the draft chapters of
that.

    -Jonathan.
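
A plain-C sketch of the same idea follows, for anyone who wants to try it
without C++. The names magic_convert, INT_CONV and FIX16_CONV are
hypothetical, chosen only for illustration. It assumes IEEE-754 doubles,
the default round-to-nearest FPU mode, two's-complement integers, and
inputs small enough to fit in 32 bits after scaling. Reading the bits
through a uint64_t sidesteps the word-offset question, on the assumption
that doubles and 64-bit integers share the same byte order.

```c
#include <stdint.h>
#include <string.h>

/* 1.5 * 2^52: after adding this, one mantissa step equals 1.0, so the
   low 32 mantissa bits hold d rounded to the nearest integer. */
static const double INT_CONV = 6755399441055744.0;

/* 1.5 * 2^36: one mantissa step equals 2^-16, so the low 32 bits hold
   d * 65536, i.e. a 16.16 fixed-point value. */
static const double FIX16_CONV = 103079215104.0;

/* Hypothetical helper: add the magic constant, then read back the low
   32 bits of the double's bit pattern as a signed integer. */
static int32_t magic_convert(double d, double magic)
{
    uint64_t bits;
    d += magic;
    memcpy(&bits, &d, sizeof bits);   /* well-defined way to view the bits */
    return (int32_t)(uint32_t)bits;   /* keep only the low 32 bits */
}
```

By the same arithmetic, a scale of 32768 would need 1.5 * 2^37 as the
constant. Note that the result is rounded to nearest rather than
truncated, so it will not always match a plain C cast.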

Michael Smith

2000-Apr-20 05:52 UTC

[vorbis-dev] integer pcm decode patch

At 01:41 PM 4/19/00 -0700, you wrote:
>Hi!
>
>I've spent the last few nights digging into the Vorbis source and working
>to implement a vorbis_synthesis_pcmout_int() function that kicks out
>interleaved int16_t pcm data.
>
>I think it's important to have this function available to make the
>job for people using the codec a little easier. This function abstracts
>out the conversion to int16_t and removes the extra overhead of
>moving the pcm data over the processor's data bus just to do the int16_t
>conversion after the Vorbis synthesis.
vorbisfile does this. It's also an order of magnitude easier to use - a
minimal example requires precisely three functions. It leaves the
conversion to 16 bit ints to later (after vorbis_synthesis_pcmout()) - but
realistically, that data will be sitting around in L2 cache most likely, so
there isn't significant extra overhead. Right now, it's pretty slow. A
minor modification to a single line makes it roughly 10% faster (for fully
decoding a single bitstream, ~4 minutes long), so I'll clean that up
(it's all in ov_read()) and commit it. Well, after making sure it DOES give
the same results (which it should).
>
>I've written that function and I've created a Linux command line player
>that reads a .ogg from stdin and writes the pcm samples to the OSS
>sound drivers using the vorbis_synthesis_pcmout_int() function.
>The Linux command line player (that's giving it too much credit, really)
>is, just like the encode/decode example, limited to 44 kHz files.
Thanks, this player might be useful for when I can't be bothered firing up
xmms. I'll try it out later.
>
>The patch for these features can be found here:
>
>   http://moon.eorbit.net/~robert/pcm16.patch
>
>There is one slight problem with the vorbis_synthesis_pcmout_int()
>function that I hope we can all solve as a team. I personally failed
>to solve this problem, but I think it important to get this function into
>the codebase before the first release. I trust once the format and the
>interface settle there will be more focus on optimization and at that
>time we should be able to solve the problem, if not before.
>
>The problem occurs in block.c line 666 (!) and line 669 of the
>patched sources. These two lines are the lines that I chose for
>doing the double to int16_t conversion. The problem is that for
>my test case this solution was actually a bit slower than the
>normal vorbis_synthesis_pcmout() call, when the original goal of
>this function was to make the decoder more efficient.
Well, I think this is probably the wrong place to do it - though it might
be advantageous to have something akin to the sample conversion routines in
vorbisfile moved into libvorbis itself - it's definitely best done AFTER
the rest of the decoding, in my opinion.
>
>The problem is that with the synthesizer spitting out int16_t data it was
>actually doing slightly more than twice as many float multiplies as
>the code that does the conversion as the very last step. This is due to the
>nature of the synthesizer, since it may make multiple passes over the pcm
>data as it overlaps/adds two windows together.
>
>Finding a better place to do the int16_t conversion is the key to solving this
>problem.  One solution that I looked into was having the mdct kick out
>int16_t values, but without having access to and understanding the original
>paper that the mdct code was based on I only made things worse.
>
>I should really go and pay some attention to my girlfriend now...
>
>
>--ruaok         Freezerburn! All else is only icing. -- Soul Coughing
>
>Robert Kaye -- robert@moon.eorbit.net  http://moon.eorbit.net/~robert

David Balazic

2000-May-03 08:53 UTC

[vorbis-dev] integer pcm decode patch

Jonathan Blow (jon@bolt-action.com) wrote:
>    Being a game developer I possess some arcane knowledge about how to
>    do wacky things with IEEE-754 floating point numbers. A standard trick
>    that we do in games is to manipulate this format directly to squeeze out
>    the values we want. This does two things: it (a) eliminates floating
>    point multiplies, which are more expensive than adds on some processors,
>    and (b) it eliminates the _ftol or equivalent code that the compiler
>    sticks in when you go from any float size to integer (This stuff is there
>    to enforce the IEEE rounding semantics which, most of the time you are
>    converting float to int, you don't really care about... and it is this
>    stuff that usually takes all the time).
I think correct rounding IS important. I noticed some code (not in vorbis)
uses simple assignments:

int i; float a;
i = a;

This is wrong for signed audio data because it rounds towards zero, while
rounding to nearest would be correct.

The error is small, but the whole idea behind vorbis is to have _good_quality_
at low space usage, right?

Dithering might be considered too, for that matter.

Regards 
David Balazic
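
To illustrate the difference, here is a small C sketch with two
hypothetical helpers (sample_truncate and sample_round are names made up
for this example). Both assume the input is already scaled and lies within
the int16_t range. With truncation every value in the open interval
(-1, 1) becomes 0, so the zero step is twice as wide as the others and the
error can approach a full step; with rounding it stays within half a step.

```c
#include <stdint.h>

/* What a plain C cast does: discard the fraction (round toward zero). */
static int16_t sample_truncate(double x)
{
    return (int16_t)x;
}

/* Round to nearest, ties away from zero, by offsetting before the cast. */
static int16_t sample_round(double x)
{
    return (int16_t)(x < 0.0 ? x - 0.5 : x + 0.5);
}
```

For example, 0.9 and -0.9 both truncate to 0, while rounding gives 1 and
-1 respectively.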
