thr3ads.net - Vorbis - [vorbis] encoder block diagram [Mar 2003]


stoffke@directbox.com

2003-Mar-12 13:58 UTC

[vorbis] encoder block diagram

I've made a block diagram of the encoder while trying to find out how it
works:

http://stoffke.freeshell.65535.net/ogg/block.html

Although there are specification documents that give very detailed
information about single aspects of the encoding (or decoding),
I'm missing documentation that gives a more general overview
of how the encoder works.
(Vorbis Illuminated seems a bit outdated, as does on2.)
 
Here is a brief description of the encoding process (as I understood it):

WINDOWING
- Vorbis uses overlapping windows with sizes between 64 and 8192 samples
(powers of two)
- one short and one long block size are used (the short block must be smaller
than or equal to the long block); each can be set to any allowed size
- the selected window size depends on the bitrate
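
A minimal sketch of the window shape, assuming the window function given in the Vorbis I specification, sin(pi/2 * sin^2(pi * (n + 0.5) / N)). The names `VALID_BLOCK_SIZES` and `vorbis_window` are hypothetical, and the asymmetric windows used where a long block meets a short block are not covered here.

```python
import math

# Allowed block sizes as described above: powers of two from 64 to 8192.
VALID_BLOCK_SIZES = [2 ** e for e in range(6, 14)]

def vorbis_window(n):
    """Return the n-point symmetric window sin(pi/2 * sin^2(pi*(i+0.5)/n))."""
    return [
        math.sin(math.pi / 2 * math.sin(math.pi * (i + 0.5) / n) ** 2)
        for i in range(n)
    ]

window = vorbis_window(64)
# Power complementarity: the squared window and its half-shifted copy sum to 1,
# which is what lets overlapping blocks add back up without gain ripple.
overlap_error = max(
    abs(window[i] ** 2 + window[i + 32] ** 2 - 1.0) for i in range(32)
)
```

The power-complementary property is what makes 50% overlap-add lossless in combination with the MDCT.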

MDCT
- transforms audio data to frequency domain
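
To make the transform step concrete, here is a naive O(N^2) MDCT and inverse in plain Python. This is only an illustration of the textbook definition, not the fast transform libvorbis uses. The function names are hypothetical, and the window is left out for brevity (without a window, a 1/N scaling on the inverse is enough for overlap-add to cancel the aliasing).

```python
import math

def mdct(block):
    """Naive MDCT: 2N time samples in, N frequency coefficients out."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def imdct(coeffs):
    """Naive inverse MDCT: N coefficients in, 2N time-aliased samples out."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for k in range(n)) / n
        for t in range(2 * n)
    ]

def overlap_add_middle(signal, n):
    """Rebuild samples n..2n-1 of a 3n-sample signal from two overlapping blocks."""
    first = imdct(mdct(signal[0:2 * n]))
    second = imdct(mdct(signal[n:3 * n]))
    # The aliasing terms of the two blocks have opposite signs and cancel.
    return [first[n + i] + second[i] for i in range(n)]
```

Each block of 2N samples yields only N coefficients, so the overlap costs no extra data; the missing information is recovered when neighbouring blocks are added together.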
 
PSYCHOACOUSTIC MODEL
- Vorbis uses its own psychoacoustic model
- FFT for tonal analysis and MDCT for noise analysis

FLOOR
- a psychoacoustic floor is created from the data supplied by the
psychoacoustic model
- the floor is a spectral envelope and represents a low-resolution
model of the audio spectrum
- floor type 0 uses LSP (line spectral pairs) and floor type 1 uses a linear
interpolation algorithm to compute the floor curve
? currently only floor type 1 is used
? I don't know whether the MDCT input for the psychoacoustic model comes
from the MDCT above or whether an extra MDCT is performed (would that make
sense at all?)
- the floor data are then subtracted (amplitude-wise) from the MDCT data,
creating a "residue"
- the residue represents the spectral fine structure of the audio signal
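
The floor-then-residue idea above can be sketched as follows, assuming the floor is given as a few (bin, level in dB) posts joined by straight lines, loosely in the spirit of floor type 1. This is a simplified floating-point illustration, not the integer line-rendering procedure from the specification, and the names `render_floor` and `residue_db` are hypothetical.

```python
def render_floor(posts, n):
    """Piecewise-linear floor curve over n bins.

    posts: list of (bin, level_db) sorted by bin, with the first post at
    bin 0 and the last at bin n - 1 (an assumption of this sketch).
    """
    floor = [0.0] * n
    for (x0, y0), (x1, y1) in zip(posts, posts[1:]):
        for x in range(x0, x1 + 1):
            floor[x] = y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return floor

def residue_db(spectrum_db, floor_db):
    """Subtract the floor in the dB domain (a division in linear amplitude)."""
    return [s - f for s, f in zip(spectrum_db, floor_db)]

posts = [(0, -20.0), (4, -40.0), (7, -70.0)]
floor = render_floor(posts, 8)
spectrum = [-18.0, -27.0, -29.5, -36.0, -41.0, -52.0, -58.0, -71.0]
residue = residue_db(spectrum, floor)
```

Adding the residue back to the floor reproduces the spectrum, which is the view a decoder takes: the coarse envelope plus the fine structure.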

CHANNEL COUPLING
- channel coupling reduces the redundancy between the left and right channels
- it works well because there is a high correlation between the floor
curves of both channels
- Vorbis has different stereo models: dual stereo, lossless stereo
(-q 6 to -q 10), phase stereo, and mixed stereo (all the modes together)
? although Vorbis supports up to 255 channels, there is no channel coupling
in streams with more than 2 channels (yet)
? I'm not sure about the position of channel coupling in the diagram
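
As an illustration of the lossless stereo mode, the sketch below models a magnitude/angle ("square polar") mapping on a pair of integer values. The `decouple` branches follow the inverse coupling step as I understand it from the Vorbis I specification, where it is applied to the residue vectors; `couple` is a hypothetical forward mapping written only so that it inverts `decouple`, and is not taken from libvorbis.

```python
def couple(left, right):
    """Map (left, right) to (magnitude, angle); hypothetical encoder side."""
    if abs(left) > abs(right):
        magnitude = left
        angle = left - right if left > 0 else right - left
    else:
        magnitude = right
        angle = left - right if right > 0 else right - left
    return magnitude, angle

def decouple(magnitude, angle):
    """Map (magnitude, angle) back to (left, right); decoder side."""
    if magnitude > 0:
        if angle > 0:
            return magnitude, magnitude - angle
        return magnitude + angle, magnitude
    if angle > 0:
        return magnitude, magnitude + angle
    return magnitude - angle, magnitude

# Strongly correlated channels give a small angle value, which is cheap to code.
m, a = couple(100, 97)
```

When the two channels are nearly equal, the angle stays close to zero, so most of the information ends up in the magnitude value.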

VECTOR QUANTIZATION
- the floor data and the residues are vector quantized by using
custom codebooks
- codebooks are adaptive and are "trained"
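
The core of vector quantization is a nearest-neighbour lookup: a short vector of values is replaced by the index of the closest codebook entry. A minimal sketch, assuming squared Euclidean distance and a tiny made-up codebook (real Vorbis codebooks are much larger, and the function name is hypothetical):

```python
def nearest_codeword(vector, codebook):
    """Return the index of the codebook entry closest to vector."""
    def squared_distance(index):
        return sum((v - c) ** 2 for v, c in zip(vector, codebook[index]))
    return min(range(len(codebook)), key=squared_distance)

# Made-up two-dimensional codebook, for illustration only.
codebook = [(0.0, 0.0), (1.0, 1.0), (2.0, -2.0)]
index = nearest_codeword((0.9, 1.2), codebook)
```

Only the index is stored; the decoder looks the vector up in the same codebook.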

HUFFMAN
- the vector codewords are then Huffman-coded to minimize redundancy
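
A short sketch of how Huffman code lengths follow from symbol frequencies, using the textbook merge-the-two-rarest construction. The frequencies are made up and the function name is hypothetical; how libvorbis derives its shipped codebooks is not shown here.

```python
import heapq

def huffman_lengths(freqs):
    """Return {symbol: code length} for a dict of symbol counts (2+ symbols)."""
    # Each heap entry is (count, tie-breaker, symbols in this subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in freqs}
    tie = len(heap)
    while len(heap) > 1:
        count1, _, syms1 = heapq.heappop(heap)
        count2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (count1 + count2, tie, syms1 + syms2))
        tie += 1
    return lengths

lengths = huffman_lengths({0: 50, 1: 25, 2: 15, 3: 10})
```

Frequent codebook indices get short codes and rare ones get long codes, which is where the redundancy reduction comes from.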

Finally, the data are packed into a bitstream.
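
The packing step can be sketched as writing variable-width fields into bytes, least significant bit first, which is the bit order the Vorbis I specification describes. The `pack_bits` helper is hypothetical and ignores Ogg page framing entirely.

```python
def pack_bits(fields):
    """Pack (value, width) pairs into bytes, least significant bit first."""
    accumulator = 0
    bit_count = 0
    for value, width in fields:
        # Mask to the field width, then place it above the bits written so far.
        accumulator |= (value & ((1 << width) - 1)) << bit_count
        bit_count += width
    return accumulator.to_bytes((bit_count + 7) // 8, "little")

# A 4-bit field, a 3-bit field and a 7-bit field share two bytes.
packed = pack_bits([(12, 4), (7, 3), (17, 7)])
```

Fields are not aligned to byte boundaries, so a value can straddle two bytes, as the 7-bit field does here.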
 
Please correct or comment on the diagram and the description.

I'm not skilled in C, so I can't "read" the source code.
But I tried to get the information from the specs,
and the mailing lists were also helpful.

I need information about Vorbis for my diploma thesis.

Thanks a lot

Stoffke

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'vorbis-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.


Linus Walleij

2003-Mar-13 11:25 UTC


[vorbis] encoder block diagram

On Wed, 2003-03-12 at 23:53, stoffke@directbox.com wrote:
> I need information about vorbis for my diploma thesis.
Nice block diagram. Will you allow your thesis to be published on the
Vorbis website when it's finished? In that case you'll probably get more
help. Also, questions like these should be directed to the vorbis-dev
list, I think.

Linus

