On Wed 10 September 2003 13:08, Richard Felton wrote:
> I have been using libvorbis for the past few weeks and have been
> asked to summarise what I have discovered about the codec. There
> is an early draft of the document at
> http://www.geocities.com/gatewaystation/vorbis/vorbis.htm -
Firstly, it may be a good idea to make it clear that what you are
documenting is the Xiph.org Vorbis reference codec. I could in
theory write an encoder that outputs valid Vorbis data yet works in
a wholly different way.
Secondly, in the diagram the MDCT and FFT appear before the
psychoacoustic stage, while in the text they are part of it. I
think the diagram is right, because transforming data into the
frequency domain doesn't have much to do with human hearing;
rather, it is done because it yields data that vector quantisation
can compress more effectively. So the psychoacoustics header should
be two paragraphs down and the text should be adapted accordingly.
In the vector quantising explanation, I would change the middle
three paragraphs to something like the following:
---
Each point falls into a section and we could transmit the relevant
section number for each point. Since we are sending only a one-digit
number rather than the entire vector, we achieve compression.
The decoder will have a codebook, which holds a vector for each
section, and use it to look up a vector for each section number it
receives. Of course, since all original vectors within a section are
eventually decoded to the same vector from the codebook, some
information is lost.
The design of a vector quantiser is a difficult task. Obviously we
want to lose as little information as possible, so the decision
boundaries and codebook vectors must be chosen to minimise the
expected difference between each original vector and its decoded
counterpart. This in turn depends on the distribution of the input
vectors, i.e. the input data of the encoder, and it is important
that the codebook works well with a wide variety of input data.
Vorbis extends the theory into more dimensions, but this is
difficult to convey graphically. An algorithm for codebook design
(similar to the one used in Vorbis) can be found on the web at
data-compression.com [5].
The encoder achieves further compression by encoding the indices
using Huffman codes before sending them to the decoder.
---
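To make the quantise/look-up round trip concrete, a minimal sketch could go with the text (the two-dimensional points and the four-entry codebook here are invented for illustration; real Vorbis codebooks are larger and higher-dimensional):

```python
# Minimal vector quantiser sketch: encode points to codebook indices,
# decode indices back to codebook vectors. Illustrative values only.
import math

codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def encode(point):
    """Return the index of the nearest codebook vector (the 'section')."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(point, codebook[i]))

def decode(index):
    """Look the index up in the codebook; within-section detail is lost."""
    return codebook[index]

points = [(0.1, 0.2), (0.9, 0.1), (0.2, 0.8)]
indices = [encode(p) for p in points]         # one small number per point
reconstructed = [decode(i) for i in indices]  # lossy: detail is gone
```

Note that only the short index stream needs to be transmitted; that is where the compression comes from.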
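The codebook design problem could likewise be illustrated with a toy version of the iterative Lloyd/LBG idea that the data-compression.com article describes (this is not the actual Vorbis training code, and the one-dimensional samples are made up):

```python
# K-means style codebook training: repeatedly assign samples to their
# nearest codebook entry, then move each entry to its cluster centroid.
import random

def train_codebook(samples, k, iterations=20, seed=0):
    rng = random.Random(seed)
    codebook = rng.sample(samples, k)  # arbitrary initial guess
    for _ in range(iterations):
        # Assign each sample to its nearest codebook entry...
        clusters = [[] for _ in range(k)]
        for s in samples:
            i = min(range(k), key=lambda j: abs(s - codebook[j]))
            clusters[i].append(s)
        # ...then move each entry to the centroid of its cluster
        # (keeping the old entry if its cluster came up empty).
        codebook = [sum(c) / len(c) if c else codebook[i]
                    for i, c in enumerate(clusters)]
    return sorted(codebook)

samples = [0.1, 0.15, 0.2, 0.9, 0.95, 1.0]
print(train_codebook(samples, 2))  # converges near [0.15, 0.95]
```

The result lands on the two cluster centres, which is exactly the "minimise the difference" criterion from the text applied to this toy distribution.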
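Finally, the Huffman step: frequent indices get short codes, rare ones get long codes. (The index stream below is invented, and in real Vorbis the codebooks carry their Huffman code lengths in the setup header rather than being built from frequencies at decode time, so this is only a sketch of the general idea.)

```python
# Huffman-coding the quantiser indices: build a prefix code from
# symbol frequencies, then concatenate the codes into a bitstream.
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Return {symbol: bitstring} built by the classic Huffman algorithm."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, unique tiebreak, {symbol: code-so-far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

indices = [0, 0, 0, 0, 1, 1, 2, 3]   # skewed index stream
codes = huffman_codes(indices)
bitstream = "".join(codes[i] for i in indices)
```

For this stream the common index 0 gets a 1-bit code, so the whole stream takes 14 bits instead of the 16 a fixed 2-bit-per-index encoding would need.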
As for the German article, the (German) online summary mentions only
that there were 6000 entries, of which 3300 covered the 64
kbit/s-compressed data. Ogg Vorbis is clearly the best at 64 kbit/s,
while at 128 kbit/s the differences are smaller, with most people
being unable to distinguish between RealAudio, WMA, MP3Pro and MP3.
Lastly, perhaps it would be possible to generate a call graph of the
encoder somehow? It would be nice to have a graphical
representation of what uses what. Or maybe a clearer link between
the source files and the blocks in the block diagram, so that it's
easy to see which part of the functionality is implemented where.
Cheers,
Lourens
--
GPG public key: http://home.student.utwente.nl/l.e.veen/lourens.key
--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/