Ralph Giles
2009-Jul-14 17:00 UTC
[ogg-dev] Fixing ogg vorbis corruption caused by bad metadata
On Tue, Jul 14, 2009 at 9:48 AM, Adam Rosi-Kessel <adam at rosi-kessel.org> wrote:

> The only issue I'm noticing is ogginfo reports:
>
> Warning: sequence number gap in stream 1. Got page 14 when expecting
> page 2. Indicates missing data.
> Warning: discontinuity in stream (1)

I'd guess this is flagging the data that was overwritten by the bad tagging code. Some audio is missing, but 12 pages isn't enough to be noticeable in most files.

> This doesn't seem to interfere with proper playing, but does anyone have
> a suggestion for a simple fix here? I suppose I just need to renumber
> the pages?

Well, it's a warning for a reason; the file is still valid and will play, it's just telling you some data is missing compared to the original encode. Renumbering the pages (and fixing the crc's after the change) would remove the warning, if that's what you want.

Cool that you've gotten them fixed!

-r
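The renumber-and-fix-CRC idea can be sketched roughly as follows. This is a minimal Python illustration, not the rogg code: the function names `ogg_crc` and `renumber_pages` are made up for this example, and it assumes the whole file fits in memory and that pages follow each other with no junk in between. The field offsets (serial number at byte 14, page sequence number at byte 18, checksum at byte 22) and the CRC parameters are those of the Ogg page header.

```python
import struct

def _make_crc_table():
    # Ogg uses an unreflected CRC-32: polynomial 0x04C11DB7, initial value 0,
    # no final XOR. This is not the same checksum as zlib.crc32.
    table = []
    for i in range(256):
        r = i << 24
        for _ in range(8):
            r = ((r << 1) ^ 0x04C11DB7) if r & 0x80000000 else (r << 1)
            r &= 0xFFFFFFFF
        table.append(r)
    return table

_CRC_TABLE = _make_crc_table()

def ogg_crc(data):
    """Checksum of a whole page, computed with the CRC field set to zero."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _CRC_TABLE[((crc >> 24) ^ byte) & 0xFF]
    return crc

def renumber_pages(data):
    """Return a copy of an Ogg byte string with consecutive page numbers per stream."""
    out = bytearray(data)
    next_seq = {}  # serial number -> next page sequence number to assign
    pos = 0
    while pos + 27 <= len(out):
        if out[pos:pos + 4] != b"OggS":
            raise ValueError("lost sync at offset %d" % pos)
        nsegs = out[pos + 26]
        header_len = 27 + nsegs
        # The segment table lists the lengths that make up the page body.
        end = pos + header_len + sum(out[pos + 27:pos + header_len])
        if end > len(out):
            raise ValueError("truncated page at offset %d" % pos)
        serial, = struct.unpack_from("<I", out, pos + 14)
        seq = next_seq.get(serial, 0)
        next_seq[serial] = seq + 1
        struct.pack_into("<I", out, pos + 18, seq)  # new page sequence number
        struct.pack_into("<I", out, pos + 22, 0)    # zero the CRC before hashing
        struct.pack_into("<I", out, pos + 22, ogg_crc(out[pos:end]))
        pos = end
    return bytes(out)
```

Something like `open("fixed.ogg", "wb").write(renumber_pages(open("in.ogg", "rb").read()))` would apply it to a file (the file names are placeholders). Renumbering only hides the gap; the overwritten audio is still missing.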
Adam Rosi-Kessel
2009-Jul-14 18:12 UTC
[ogg-dev] Fixing ogg vorbis corruption caused by bad metadata
Ralph Giles wrote, on 7/14/2009 1:00 PM:

> Well, it's a warning for a reason; the file is still valid and will
> play, it's just telling you some data is missing compared to the
> original encode. Renumbering the pages (and fixing the crc's after the
> change) would remove the warning, if that's what you want.

Does one of the rogg tools (or some other tool) renumber pages? Or would I have to write something to do this? I assume the page number is just a byte in the packet header so would be easy enough to fix if no one has done this yet.

> Cool that you've gotten them fixed!

Yes, and I take it you are the rogg author, so thanks for saving my 1000+ files!

For posterity, I will post my ultimate unified solution here -- I think it should be almost entirely automated.

Adam
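For reference on where the page number lives: in the Ogg container format it is a 32-bit little-endian field at byte offset 18 of the page header (rather than a single byte in a packet header), followed by the 32-bit checksum at offset 22. A small Python sketch of reading the fixed 27-byte header is below; the `PageHeader` tuple and `parse_page_header` function are names invented for this illustration.

```python
import struct
from collections import namedtuple

# Fixed part of an Ogg page header (27 bytes); the segment table follows it.
PageHeader = namedtuple(
    "PageHeader",
    "version header_type granulepos serial sequence crc segments",
)

def parse_page_header(buf, offset=0):
    """Decode the fixed 27-byte page header starting at `offset`."""
    (magic, version, header_type, granulepos,
     serial, sequence, crc, segments) = struct.unpack_from("<4sBBqIIIB", buf, offset)
    if magic != b"OggS":
        raise ValueError("no OggS capture pattern at offset %d" % offset)
    return PageHeader(version, header_type, granulepos,
                      serial, sequence, crc, segments)
```

Calling `parse_page_header(data)` on the first bytes of a file gives the first page's fields; `sequence` is the number ogginfo compares against the page it expected.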
Ralph Giles
2009-Jul-14 20:06 UTC
[ogg-dev] Fixing ogg vorbis corruption caused by bad metadata
On Tue, Jul 14, 2009 at 11:12 AM, Adam Rosi-Kessel <adam at rosi-kessel.org> wrote:

> Does one of the rogg tools (or some other tool) renumber pages? Or would
> I have to write something to do this? I assume the page number is
> just a byte in the packet header so would be easy enough to fix if no
> one has done this yet.

There's nothing in the rogg scripts that could renumber pages, but it's a three-line change to rogg_serial.c. I wrote rogg to make it easy to write simple mutation scripts like that.

Scary, that code has users now! :-)

-r
Adam Rosi-Kessel
2009-Jul-15 18:34 UTC
[ogg-dev] Fixing ogg vorbis corruption caused by bad metadata
Adam Rosi-Kessel wrote, on 7/14/2009 2:12 PM:

>> Cool that you've gotten them fixed!
> Yes, and I take it you are the rogg author, so thanks for saving my
> 1000+ files! For posterity, I will post my ultimate unified solution
> here -- I think it should be almost entirely automated.

So I've written a script to do the following:

(1) Set the serial numbers on all packets for the "good" ogg and the "bad" ogg to the same value (shelling out to rogg_serial)
(2) String-scrape the "bad" ogg to extract the metadata such as artist, album, genre, etc. This metadata appears to be intact in all of my "bad" oggs.
(3) Extract all packets with granulepos 0 from the "good" ogg
(4) Extract all packets with granulepos >0 from the "bad" ogg
(5) Concatenate the results of 3 & 4
(6) Put the correct metadata back in with vorbiscomment

This appears to work flawlessly with some files. For others, although the output is a valid ogg, the sound is scrambled. I'm assuming this is because I need to swap in a different "good" header for those oggs, presumably because they were, e.g., ripped with different settings.

Any ideas on how to find the proper "good" header to go with each corrupted ogg? My first idea was to grep for the ripper/codec identification string (e.g., "Xiph.Org libVorbis I 20050304"), which sometimes works but not always. In other words, there are some bad files with "Xiph.Org libVorbis I 20050304" where a good file with that string in the header still generates distorted sound.

Anything I could drill down to on the byte level in the headers to automate the matching process? The container documentation is pretty clear, but I'm not sure what I should be looking for at the codec level to try to make the match.

Adam
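Steps (3) to (5) above could look something like the following Python sketch. It is not Adam's script, which was not posted in this message: `iter_pages` and `splice_headers` are hypothetical names, the selection works on whole pages (granulepos is a per-page field), and it assumes both files are already on the same serial number and fit in memory. Reading granulepos as unsigned is a choice made here so that pages on which no packet ends (stored as -1) count as ">0" and are kept with the audio.

```python
import struct

def iter_pages(data):
    """Yield (granulepos, page_bytes) for each page in an Ogg byte string."""
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError("lost sync at offset %d" % pos)
        nsegs = data[pos + 26]
        # Page length = fixed header + segment table + sum of segment lengths.
        end = pos + 27 + nsegs + sum(data[pos + 27:pos + 27 + nsegs])
        if end > len(data):
            raise ValueError("truncated page at offset %d" % pos)
        # Unsigned read: a stored -1 becomes 2**64 - 1, i.e. greater than 0.
        granulepos, = struct.unpack_from("<Q", data, pos + 6)
        yield granulepos, data[pos:end]
        pos = end

def splice_headers(good, bad):
    """Header pages (granulepos 0) from `good` followed by later pages from `bad`."""
    header_pages = [page for g, page in iter_pages(good) if g == 0]
    audio_pages = [page for g, page in iter_pages(bad) if g > 0]
    return b"".join(header_pages + audio_pages)
```

The pages are copied unchanged, so each keeps its original page sequence number and checksum. The audio only decodes correctly if the header pages taken from the "good" file match the settings the "bad" file was encoded with, which is the matching problem described above.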