Hi all,

I want to build a static database containing hundreds of thousands of very short audio files, each no more than 100 milliseconds long. They are made by splitting larger audio files into tiny pieces.

I encode each little file separately, but do not store the three Vorbis header packets, which are identical for all the files. I do not use an Ogg stream; I only store the plain Vorbis packets.

When I encode the longer files, which add up to the same total audio length, I get a 20% higher compression ratio. However, to keep lookup times fast I cannot afford to decode such large files just to extract one small chunk from them.

Do you know of a way to compress the little chunks while still getting the compression ratio of the original, longer files? Where does the difference in compression ratio come from?

Is it possible to get even higher compression without quality loss by tweaking any of the parameters or block sizes hardcoded in the libvorbis source?

Thanks,

Michał
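A minimal sketch of the per-chunk encoding path described above, assuming libvorbisenc with 44.1 kHz mono float input; store_packet() is a hypothetical callback that writes raw packet bytes into the database record, and error handling is omitted:

#include <vorbis/vorbisenc.h>

/* Encode `samples` frames of mono 44.1 kHz float PCM as a standalone chunk.
 * The three header packets are generated but not stored, since they are
 * identical for every chunk encoded with the same settings. */
static void encode_chunk(const float *pcm, long samples,
                         void (*store_packet)(const unsigned char *, long))
{
    vorbis_info      vi;
    vorbis_comment   vc;
    vorbis_dsp_state vd;
    vorbis_block     vb;
    ogg_packet       header, header_comm, header_code, op;

    vorbis_info_init(&vi);
    vorbis_encode_init_vbr(&vi, 1, 44100, 0.4f);  /* ~ quality 4 */
    vorbis_comment_init(&vc);
    vorbis_analysis_init(&vd, &vi);
    vorbis_block_init(&vd, &vb);

    /* Headers are produced but deliberately thrown away. */
    vorbis_analysis_headerout(&vd, &vc, &header, &header_comm, &header_code);

    /* Submit the whole chunk and immediately mark end of input. */
    float **buf = vorbis_analysis_buffer(&vd, samples);
    for (long i = 0; i < samples; i++)
        buf[0][i] = pcm[i];
    vorbis_analysis_wrote(&vd, samples);
    vorbis_analysis_wrote(&vd, 0);

    /* Pull out the audio packets and store them as raw Vorbis packets,
     * with no Ogg framing. */
    while (vorbis_analysis_blockout(&vd, &vb) == 1) {
        vorbis_analysis(&vb, NULL);
        vorbis_bitrate_addblock(&vb);
        while (vorbis_bitrate_flushpacket(&vd, &op) == 1)
            store_packet(op.packet, op.bytes);
    }

    vorbis_block_clear(&vb);
    vorbis_dsp_clear(&vd);
    vorbis_comment_clear(&vc);
    vorbis_info_clear(&vi);
}

Note that without Ogg framing the database also has to record each packet's length, since raw Vorbis packets are not self-delimiting.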
On Mon, Mar 30, 2009 at 4:52 AM, Michał Czuczman <mczuczman at ivosoftware.com> wrote:
[snip]

What you are doing is fairly far outside the design of Vorbis. You're talking about using chunks comparable to the Vorbis block length, and Vorbis uses a 50% block overlap. So you're losing compression to the wasted overlap on each side, and you're also leaving the psychoacoustic model unprimed.

You could reduce the block lengths used by Vorbis, but you would not be satisfied with the performance. For granules this short you would have better luck using CELT: http://celt-codec.org/ CELT also has overlapped blocks, but the typical block size is much smaller and the overlap is reduced.

That said, since you are clearly processing the audio, you should be aware that lossy compression may simply be unsuitable for your application: a lossy compressor makes assumptions about how the audio will be heard that your processing may violate.
On Mon, Mar 30, 2009 at 9:13 AM, Gregory Maxwell <gmaxwell at gmail.com> wrote:
[snip]
> about using chunks comparable to the Vorbis block length and Vorbis uses a 50%
> block overlap.

Before someone else corrects my non-standard usage: the blocks are overlapped 50% on each side, i.e. the entire block overlaps with other blocks.
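To put rough numbers on the overlap point, assuming 44.1 kHz audio and the usual libvorbis block sizes of 2048 samples (long) and 256 samples (short):

    100 ms chunk at 44100 Hz   ~= 4410 samples
    long block (2048 samples)  ~= 46 ms
    short block (256 samples)  ~=  6 ms

So each 100 ms chunk is only a couple of long blocks wide, and because every block overlaps its neighbours by 50% on each side, the first and last blocks of each tiny file are spent ramping the window up and down rather than coding steady-state audio, and the encoder starts with no signal history. Spread over hundreds of thousands of chunks, that per-file overhead could plausibly account for much of the 20% gap.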
The difference most likely comes from the psychoacoustic model not having time to adapt properly to the signal within such short periods of time. At the start of each file it has to assume a fairly worst-case history of silence having played before the sound, so the model has had no time to adapt yet.

I'd be tempted to try a slightly lower quality setting: because the psychoacoustics don't have time to adapt within such short files, the encoder is probably over-encoding them for a given quality setting due to their short durations. Or it could be windowing-related, due to the shortness of the files?

On Mon, Mar 30, 2009 at 9:52 AM, Michał Czuczman <mczuczman at ivosoftware.com> wrote:
[snip]
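If the chunks are being encoded through libvorbisenc as in the sketch further up, that suggestion is a one-line change, since vorbis_encode_init_vbr() takes the base quality on a 0.0-1.0 scale (roughly -q 0 to -q 10):

/* e.g. drop from roughly -q 4 to -q 2 to compensate for the
 * over-allocation on short, unprimed files */
vorbis_encode_init_vbr(&vi, 1, 44100, 0.2f);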