thr3ads.net - Vorbis dev - [vorbis-dev] Vorbis packet #3, codebooks and their large size [Nov 2000]

Dave, Anish

2000-Nov-09 18:26 UTC

[vorbis-dev] Vorbis packet #3, codebooks and their large size

Hi,

Am I correct in understanding that the codebooks are *not* adaptive during
compression?  I see that Packet #3 is written to the stream in the beginning
of the encode process with no modification.  If the codebooks are not
adaptive, then why are codebooks included in the stream at all?  Why not
pass the mode type (A or B or C...) instead of all the mode info and let the
decoder load its own mode_A/B/C... tables?  Was it done for patent reasons?

The reason I ask is that vorbis becomes useless for small audio samples like
speech because there is always that ~10K overhead for packet 3.  With
hundreds of samples in memory (or disk), it becomes an issue.
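
To put rough numbers on that overhead, here is a small back-of-the-envelope sketch. Only the ~10K header size comes from the message above; the sample count (300) and the 4 KiB of audio per clip are assumed figures chosen purely for illustration, and the function name is made up.

```python
def header_overhead(num_samples, header_bytes=10 * 1024, audio_bytes=4 * 1024):
    """Return (total header bytes, share of all stored bytes spent on headers).

    header_bytes is the ~10K figure quoted in the message; audio_bytes is an
    assumed size for one short speech clip, not a measured value.
    """
    total_header = num_samples * header_bytes
    total_audio = num_samples * audio_bytes
    return total_header, total_header / (total_header + total_audio)


total, share = header_overhead(300)
print(f"{total / 1024:.0f} KiB of headers, {share:.0%} of all stored bytes")
```

Under these assumptions the repeated headers outweigh the audio itself, which is the situation the question is about.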

Would we be breaking any patents if we modified the encoder and decoder to
pass just the mode type instead of all the mode information (time_param,
floor_param, res_param, mapping_param)?

In the future (version 1.0) would the codebooks still be included in the
vorbis stream or will they eventually be part of both the encoder and the
decoder?

Thanks.
---------------------------
Anish Dave
Univ. of Waterloo
Comp Sci.

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'vorbis-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Michael Smith

2000-Nov-09 18:55 UTC

[vorbis-dev] Vorbis packet #3, codebooks and their large size

At 06:26 PM 11/9/00 -0800, you wrote:
>Hi,
>
>Am I correct in understanding that the codebooks are *not* adaptive during
>compression?  I see that Packet #3 is written to the stream in the beginning
>of the encode process with no modification.  If the codebooks are not
>adaptive, then why are codebooks included in the stream at all?  Why not
>pass the mode type (A or B or C...) instead of all the mode info and let the
>decoder load its own mode_A/B/C... tables?  Was it done for patent reasons?
As far as I know it wasn't done for patent reasons (that certainly wasn't
the main motivation, anyway). Although the codebooks are encoded as-is into
the stream by the current encoder, the format is designed so that they
could, possibly, be adaptively created (by a two-pass encoder, for instance).

The codebooks have changed completely with each beta, but because they
travel in the stream, the current decoder can decode any file produced since
the file format was finalised (some time before beta 1). This is mostly why
it's done this way: as things become more polished, new codebooks go in, but
backwards compatibility remains. However, see below for alternatives.
>
>The reason I ask is that vorbis becomes useless for small audio samples like
>speech because there is always that ~10K overhead for packet 3.  With
>hundreds of samples in memory (or disk), it becomes an issue.
Yes, it's a significant overhead for small samples. This is a known issue.
It is possible (and probably will be done in the future) to build smaller
codebooks for this type of situation (which might also be worthwhile for
streaming purposes). There's still going to be a non-zero overhead, but
that's always going to happen with a flexible format. I'm not sure what
(realistically) the minimum size you can get for decent sound is, but it
should be MUCH smaller than the current books.
>
>Would we be breaking any patents if we modified the encoder and decoder to
>pass just the mode type instead of all the mode information (time_param,
>floor_param, res_param, mapping_param)?
No, you wouldn't be breaking patents, to my knowledge (but I'm not
knowledgeable on patent issues).
>
>In the future (version 1.0) would the codebooks still be included in the
>vorbis stream or will they eventually be part of both the encoder and the
>decoder?
Yes, the codebooks will absolutely still be included in the stream, for the
reasons I mentioned above. The format isn't going to change (well, I suppose
it will change for channel coupling).

However, it is possible (and I think Monty has mentioned this in the past,
but it'd be a post-1.0 thing, I suppose) to create a separate vorbis
mapping which uses a fixed set of codebooks (similar to how mp3, etc.
work). This would lose you flexibility (rather a lot of flexibility), but
would be more appropriate to your uses. This is possibly the avenue you
want to pursue.

Michael

Mercier, Dave

2000-Nov-09 19:06 UTC

[vorbis-dev] Vorbis packet #3, codebooks and their large size

Perhaps instead of having Vorbis work with codebooks inside the stream or a
standard set of codebooks in the decoder, Vorbis could simply support both?
When V1.0 is all nice and polished, and some good codebooks have been tuned,
perhaps they could be included as part of the decoder - maybe it doesn't
even have to include the full range either - the lower bit rate ones are
probably more important in this case. 

I think this would really help out for applications dealing with smaller
samples. Of course it may complicate things, so I guess it's purely a design
decision.
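
As a rough illustration of what "supporting both" could look like on the decoder side, here is a minimal sketch. It is entirely hypothetical: the reference tag, the mode names, and the placeholder table contents are invented for this example and are not part of any Vorbis format or library. The only real detail assumed is that an in-stream setup header starts with the byte 0x05 followed by "vorbis", as in the Vorbis I specification.

```python
# Hypothetical marker for a "use the built-in codebooks" packet. Nothing like
# this exists in the Vorbis format; it is invented here for illustration only.
REFERENCE_TAG = b"\x7fvbref"

# Placeholder setup headers a decoder might ship with. Real entries would be
# complete setup packets (codebooks, floors, residues, mappings, modes).
BUILTIN_SETUPS = {
    b"A": b"\x05vorbis" + b"<built-in mode A setup>",
    b"B": b"\x05vorbis" + b"<built-in mode B setup>",
}


def resolve_setup(packet: bytes) -> bytes:
    """Return the setup header to decode with, from the stream or built in."""
    if packet.startswith(b"\x05vorbis"):
        # Codebooks travel in the stream, as they do today.
        return packet
    if packet.startswith(REFERENCE_TAG):
        # Short packet that only names one of the decoder's built-in sets.
        mode = packet[len(REFERENCE_TAG):]
        if mode in BUILTIN_SETUPS:
            return BUILTIN_SETUPS[mode]
        raise ValueError(f"no built-in setup for mode {mode!r}")
    raise ValueError("packet is neither a setup header nor a reference")
```

The trade-off is the one raised in this thread: a reference packet costs a handful of bytes instead of ~10K, but a stream that uses it can only be decoded by a decoder that ships the same tables.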

Thanks,
Dave.

-----Original Message-----
From: Michael Smith [mailto:msmith@labyrinth.net.au]
Sent: Thursday, November 09, 2000 6:56 PM
To: vorbis-dev@xiph.org
Subject: Re: [vorbis-dev] Vorbis packet #3, codebooks and their large
size

However, it is possible (and I think Monty has mentioned this in the past,
but it'd be a post-1.0 thing, I suppose) to create a separate vorbis
mapping which uses a fixed set of codebooks (similar to how mp3, etc.
work). This would lose you flexibility (rather a lot of flexibility), but
would be more appropriate to your uses. This is possibly the avenue you
want to pursue.

Michael

Michael Smith

2000-Nov-12 01:49 UTC

[vorbis-dev] Vorbis packet #3, codebooks and their large size

At 06:26 PM 11/9/00 -0800, you wrote:
>Hi,
>
>Am I correct in understanding that the codebooks are *not* adaptive during
>compression?  I see that Packet #3 is written to the stream in the beginning
>of the encode process with no modification.  If the codebooks are not
>adaptive, then why are codebooks included in the stream at all?  Why not
>pass the mode type (A or B or C...) instead of all the mode info and let the
>decoder load its own mode_A/B/C... tables?  Was it done for patent reasons?
>
>The reason I ask is that vorbis becomes useless for small audio samples like
>speech because there is always that ~10K overhead for packet 3.  With
>hundreds of samples in memory (or disk), it becomes an issue.
>
>Would we be breaking any patents if we modified the encoder and decoder to
>pass just the mode type instead of all the mode information (time_param,
>floor_param, res_param, mapping_param)?
>
>In the future (version 1.0) would the codebooks still be included in the
>vorbis stream or will they eventually be part of both the encoder and the
>decoder?
>
Some additional thoughts...

For your purposes, the first three packets (main header, comments,
codebooks) will probably always be identical (assuming you don't use the
comment header - you might, in which case things change slightly, but not
significantly), or at least one of a small set of headers (for different
sample rates, number of channels, etc.). 

What you could do is to cache the headers (after removing the ogg
encapsulation) in memory. Then, on decode, you could feed the cached
headers to the decode engine first, then the remainder of the file. Since
the encoder flushes the output stream (forcing a new page) after the
headers, this is very simple.
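
A minimal sketch of that pre-processing step, assuming a single logical stream whose three header packets end on a page boundary (which is what the flush described above gives you). The function name is made up for this example; the page layout used (the "OggS" capture pattern, the segment count at byte 26, then the lacing values) follows the Ogg framing specification. Page checksums are not verified here.

```python
def split_vorbis_headers(data: bytes):
    """Return (header_packets, rest) for a single-stream Ogg Vorbis file.

    header_packets holds the three header packets with the Ogg framing
    removed; rest is the untouched remainder of the file (the audio pages).
    """
    packets = []           # completed header packets
    partial = bytearray()  # packet still being assembled across segments
    pos = 0
    while len(packets) < 3:
        if data[pos:pos + 4] != b"OggS":
            raise ValueError("not at an Ogg page boundary")
        nsegs = data[pos + 26]
        lacing = data[pos + 27:pos + 27 + nsegs]
        body = pos + 27 + nsegs
        for size in lacing:
            partial += data[body:body + size]
            body += size
            if size < 255:  # a lacing value below 255 ends a packet
                packets.append(bytes(partial))
                partial.clear()
        pos = body          # first byte of the next page
    if len(packets) != 3 or partial:
        raise ValueError("audio data shares a page with the headers")
    return packets, data[pos:]
```

The returned packets could then be kept in a dictionary (for example keyed by sample rate and channel count) and handed to the decoder before the audio pages of each sample. How they are handed over depends on the decoder library in use and is left out of this sketch.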

For on-disk storage, you could either use the full file, as normal, or just
have the headers already chopped off (making it not a complete vorbis file,
but still usable). This latter approach would only be possible if you're
sure that the headers will always be the same - it would also simplify
caching logic.
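
One way the on-disk variant could be arranged is sketched below. This is an assumption-laden illustration, not part of the suggestion above: the file naming scheme and function names are invented, and keying the shared header blob by a SHA-1 digest is an added safeguard for the case where more than one set of headers turns up. Here the header blob is the raw header pages, so gluing it back in front of the body yields a complete stream.

```python
import hashlib
import os


def store_sample(directory, name, header_bytes, body_bytes):
    """Write the body to <name>.body and the header pages once per digest.

    Returns the digest, which the caller keeps (for example in an index)
    so the matching headers can be found again at load time.
    """
    digest = hashlib.sha1(header_bytes).hexdigest()
    header_path = os.path.join(directory, f"headers-{digest}.bin")
    if not os.path.exists(header_path):  # identical headers are stored once
        with open(header_path, "wb") as f:
            f.write(header_bytes)
    with open(os.path.join(directory, f"{name}.body"), "wb") as f:
        f.write(body_bytes)
    return digest


def load_sample(directory, name, digest):
    """Rebuild the full stream: shared header pages followed by the body."""
    with open(os.path.join(directory, f"headers-{digest}.bin"), "rb") as f:
        header_bytes = f.read()
    with open(os.path.join(directory, f"{name}.body"), "rb") as f:
        return header_bytes + f.read()
```

If the headers really are always identical, the digest can be dropped and a single fixed header file used instead, which is the simpler case described above.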

Basically, this approach means a fairly trivial amount of pre-processing
when you read the file in, and then things work as you want - essentially,
codebooks are static and unchanging. 

Michael
