Adam Rosi-Kessel
2009-Jul-14 11:09 UTC
[ogg-dev] Fixing ogg vorbis corruption caused by bad metadata
Monty Montgomery wrote, on 7/14/2009 1:44 AM:
> On Tue, Jul 14, 2009 at 1:41 AM, Erik de Castro Lopo<mle+la at mega-nerd.com> wrote:
>> Monty Montgomery wrote:
>>
>>> Yes. Without the first three packets (which hold all the codec
>>> settings and all the instruction how to handle the subsequent packets)
>>> the rest of the stream is gibberish. Vorbis can't even unpack the
>>> bits without the codebooks packed into the third header.
>>
>> Curiosity man here.
>>
>> Is there a finite set of predetermined codebooks or is the codebook
>> source dependent?
>
> Both-- The current encoders use predetermined books, but its a large
> set. And the set has often changed between releases.
>
> The set is small enough that its possible to make good reasonable guesses.

So if I understand correctly -- the first packet is just OggS, so that's
easy to replace. The second packet is the metadata, which we can lose.
It's just the third packet that needs to be reconstructed. After that,
you could start at any packet division in the rest of the file and it
would play fine? So this generic restore tool that I'm positing would
just need to try every known codebook set until a valid file was
produced? How hard would that be?

What I'm puzzled by, based on the information above, is why my attempted
fixes haven't worked: pick a valid ogg vorbis file that I believe was
ripped with the same tool around the same time, swap in the first three
packets from that file, modify the serial number to match the new file.
Thus far this technique hasn't actually fixed any files.

Adam
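For reference, the three header packets Monty describes can be told apart by their first seven bytes: a packet type byte of 1 (identification), 3 (comment) or 5 (setup, which carries the codebooks), followed by the ASCII string "vorbis". The sketch below is only an illustration of that layout; the function names are made up for this note and it assumes the packet has already been extracted from its Ogg pages.

```python
import struct

# Packet type byte -> header name, per the Vorbis I header layout.
HEADER_KINDS = {1: "identification", 3: "comment", 5: "setup"}


def vorbis_header_kind(packet):
    """Return the header name for a Vorbis header packet, else None."""
    if len(packet) >= 7 and packet[1:7] == b"vorbis":
        return HEADER_KINDS.get(packet[0])
    return None


def parse_identification(packet):
    """Decode the fixed-size fields of the identification header."""
    if vorbis_header_kind(packet) != "identification" or len(packet) < 30:
        raise ValueError("not a complete Vorbis identification header")
    # All fields are little-endian and follow the 7-byte packet prefix.
    version, channels, rate, br_max, br_nominal, br_min = struct.unpack_from(
        "<IBIiii", packet, 7
    )
    sizes = packet[28]  # two 4-bit exponents, low nibble first
    return {
        "version": version,
        "channels": channels,
        "sample_rate": rate,
        "bitrate_max": br_max,
        "bitrate_nominal": br_nominal,
        "bitrate_min": br_min,
        "blocksizes": (1 << (sizes & 0x0F), 1 << (sizes >> 4)),
    }
```

Comparing these fields between a damaged file and a candidate donor file is one cheap way to check whether the donor's headers are even plausible before swapping them in.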
ogg.k.ogg.k at googlemail.com
2009-Jul-14 11:16 UTC
[ogg-dev] Fixing ogg vorbis corruption caused by bad metadata
> So if I understand correctly -- the first packet is just OggS, so that's

No. The first packet is the ID header, containing information such as
bitrate, sample rate, number of channels, etc. OggS is the ogg level page
capture pattern. Pages encapsulate packets.

> easy to replace. The second packet is the metadata, which we can lose.
> It's just the third packet that needs to be reconstructed. After that,
> you could start at any packet division in the rest of the file and it
> would play fine? So this generic restore tool that I'm positing would
> just need to try every known codebook set until a valid file was
> produced? How hard would that be?

You'd need to renumber and regenerate CRCs. rogg might do all that
(a set of tools somewhere on svn.xiph.org).

> What I'm puzzled by, based on the information above, is why my attempted
> fixes haven't worked: pick a valid ogg vorbis file that I believe was
> ripped with the same tool around the same time, swap in the first three
> packets from that file, modify the serial number to match the new file.
> Thus far this technique hasn't actually fixed any files.

CRCs ? If they're wrong (and they'd become wrong if you fiddle with the
serial number without regenerating them), libogg will ignore the page.
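To make the CRC point concrete, here is a rough sketch of patching the serial number in a single Ogg page and regenerating its checksum. It is not rogg's code; the helper names are invented for this note. It assumes the standard Ogg page layout (serial number at byte 14, page sequence number at byte 18, CRC at byte 22, segment count at byte 26) and the Ogg CRC parameters (polynomial 0x04c11db7, initial value 0, no bit reflection, no final XOR), computed over the whole page with the CRC field zeroed.

```python
import struct


def _make_table():
    # Table for the non-reflected CRC-32 with polynomial 0x04c11db7.
    table = []
    for i in range(256):
        r = i << 24
        for _ in range(8):
            r = ((r << 1) ^ 0x04C11DB7) if r & 0x80000000 else (r << 1)
            r &= 0xFFFFFFFF
        table.append(r)
    return table


_CRC_TABLE = _make_table()


def ogg_crc(data):
    """Ogg page checksum: initial value 0, no final XOR."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _CRC_TABLE[((crc >> 24) ^ byte) & 0xFF]
    return crc


def page_length(buf, offset=0):
    """Total size in bytes of the page starting at offset."""
    if buf[offset:offset + 4] != b"OggS":
        raise ValueError("no OggS capture pattern at offset %d" % offset)
    nsegs = buf[offset + 26]
    lacing = buf[offset + 27:offset + 27 + nsegs]
    return 27 + nsegs + sum(lacing)


def rewrite_page(page, serial=None, sequence=None):
    """Return a copy of page with optional new serial/sequence and a fresh CRC."""
    page = bytearray(page)
    if serial is not None:
        struct.pack_into("<I", page, 14, serial)
    if sequence is not None:
        struct.pack_into("<I", page, 18, sequence)
    page[22:26] = b"\x00\x00\x00\x00"  # CRC is computed with this field zeroed
    struct.pack_into("<I", page, 22, ogg_crc(page))
    return bytes(page)
```

A fix script along these lines would walk the file page by page using page_length(), call rewrite_page() on each page it touches, and write the results back out in order.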
Adam Rosi-Kessel
2009-Jul-14 11:55 UTC
[ogg-dev] Fixing ogg vorbis corruption caused by bad metadata
ogg.k.ogg.k at googlemail.com wrote, on 7/14/2009 7:16 AM:
>> easy to replace. The second packet is the metadata, which we can lose.
>> It's just the third packet that needs to be reconstructed. After that,
>> you could start at any packet division in the rest of the file and it
>> would play fine? So this generic restore tool that I'm positing would
>> just need to try every known codebook set until a valid file was
>> produced? How hard would that be?
> You'd need to renumber and regenerate CRCs. rogg might do all that
> (a set of tools somewhere on svn.xiph.org).
>> What I'm puzzled by, based on the information above, is why my attempted
>> fixes haven't worked: pick a valid ogg vorbis file that I believe was
>> ripped with the same tool around the same time, swap in the first three
>> packets from that file, modify the serial number to match the new file.
>> Thus far this technique hasn't actually fixed any files.
> CRCs ? If they're wrong (and they'd become wrong if you fiddle with the
> serial number without regenerating them), libogg will ignore the page.

Thank you! I'd never heard of rogg. That definitely gets me one step
closer to a solution.

Taking a file where I had substituted the header from a good ogg into
the "bad" ogg, and then using rogg to fix the crc, I now have a playable
file. What's odd is it starts out, for about a second, as audible and
normal, and then the audio is scrambled -- perhaps a bitrate issue?

Here's a sample of the result of this process:

http://adam.rosi-kessel.org/bugs/liboggz/484/fixed_with_rogg.ogg

So I think this might just need a little more tweaking to be turned into
an automated fix script. Any ideas based on that file exactly what is
wrong and how to correct it?

Adam
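One check that can be done on a file like this before looking at the codec level is whether the pages are numbered consistently, since the earlier reply notes that pages need renumbering as well as new CRCs. The sketch below walks the pages of a file held in memory and reports any jump in page sequence numbers within a logical stream. The function names are made up for this note, it assumes the standard Ogg page header layout, and it cannot say anything about why the decoded audio is scrambled; it only rules page-level discontinuities in or out.

```python
import struct


def iter_pages(data):
    """Yield basic header fields for each Ogg page found back to back in data."""
    offset = 0
    while offset + 27 <= len(data):
        if data[offset:offset + 4] != b"OggS":
            raise ValueError("lost page sync at offset %d" % offset)
        header_type = data[offset + 5]
        # Granule position, serial number, page sequence number (little-endian).
        granule, serial, sequence = struct.unpack_from("<qII", data, offset + 6)
        nsegs = data[offset + 26]
        lacing = data[offset + 27:offset + 27 + nsegs]
        yield {
            "offset": offset,
            "header_type": header_type,
            "granule": granule,
            "serial": serial,
            "sequence": sequence,
        }
        offset += 27 + nsegs + sum(lacing)


def sequence_gaps(pages):
    """Return (offset, expected, found) for each out-of-order page number."""
    expected = {}  # next expected sequence number, per serial number
    gaps = []
    for page in pages:
        want = expected.get(page["serial"])
        if want is not None and page["sequence"] != want:
            gaps.append((page["offset"], want, page["sequence"]))
        expected[page["serial"]] = page["sequence"] + 1
    return gaps
```

For example, sequence_gaps(iter_pages(open(path, "rb").read())) returns an empty list when every logical stream in the file is numbered without jumps.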