I've developed an indexer which embeds a keyframe index track in Ogg files. It embeds the index in its own track, so that players that don't understand or don't want to use the index can just ignore it.

Ogg needs this to make seeking over networks faster and more efficient.

Currently we must do a bisection search when seeking, which usually takes around 6 HTTP requests, give or take a few. If we are to compete with existing internet video (AKA the "YouTube" case) we need to do that faster and more efficiently - using fewer HTTP requests. We need an index so that seeking only takes 1 HTTP request.

Using an index of keyframes also makes it easier to seek without visual artifacts - though of course robust players should be able to do that in the absence of an index.

You can download the source code for my indexer here:
http://github.com/cpearce/OggIndex

My specification for the index track is stored in that repo, here:
http://tinyurl.com/l4c9yg

I'd appreciate comments on the spec...

To see how it improves network seeking performance, you can download a version of Firefox which can take advantage of indexes here:
http://pearce.org.nz/video/firefox-indexed-ogg-seek.linux.tar.bz2
http://pearce.org.nz/video/firefox-indexed-ogg-seek.macosx.dmg
http://pearce.org.nz/video/firefox-indexed-ogg-seek.win32.zip

Then point that browser here:
http://pearce.org.nz/video/indexed-seek-demo.html
(There are a couple of other indexed Ogg files in http://pearce.org.nz/video/ too)

In terms of compatibility, currently the following players can play and seek in indexed files (i.e. aren't broken when playing/seeking indexed files):
* VLC
* Cortado
* XiphQT plugin

The following players can play indexed files, but can't seek:
* Anything that uses liboggz (including Firefox 3.5).
* The DirectShow filters (do these use liboggz?)

It only takes a pretty trivial patch to enable liboggz (and thus FF3.5) to seek in files with an index; liboggz refuses to seek because it doesn't understand metrics for the index track. It shouldn't be too hard to fix other players as well.

Totem on Ubuntu 9.04 refuses to play the indexed files, though it decodes the first few frames. That may require a gstreamer patch.

So... what do you guys think?
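
A minimal sketch of why an index reduces a seek to a single request: once the key points are in memory, the player can pick the last key point at or before the target time and issue one HTTP range request from that byte offset. The key point values, the `(time, offset)` tuple shape and the function names below are hypothetical, chosen only for illustration; they are not taken from the indexer or the Firefox builds.

```python
from bisect import bisect_right

# Hypothetical in-memory index, sorted by time:
# (presentation time in ms, byte offset of the page holding the keyframe).
KEY_POINTS = [(0, 0), (2000, 81920), (4000, 163840), (6000, 262144), (8000, 344064)]


def seek_offset(key_points, target_ms):
    """Return the byte offset of the last key point at or before target_ms."""
    times = [time_ms for time_ms, _ in key_points]
    # bisect_right gives the insertion point after any equal entries, so
    # subtracting one lands on the last key point whose time <= target_ms.
    i = max(bisect_right(times, target_ms) - 1, 0)
    return key_points[i][1]


def range_header(offset):
    """Build an HTTP Range header asking for the file from offset onwards."""
    return {"Range": f"bytes={offset}-"}


if __name__ == "__main__":
    offset = seek_offset(KEY_POINTS, 5000)
    print(offset, range_header(offset))
```

The lookup itself is local, so the only network traffic is the one range request, compared with one request per probe in a bisection search.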
Hi Chris,

I love the idea and even more that you already have working code. I think this is probably the cleanest approach to Ogg seeking that I have seen in a while. I think the additional overhead is tolerable.

Please note that there will be a third solution to the problem of seeking over HTTP which will not require Ogg files with keyframe index tracks, but can make use of them. The idea here is to use media fragment URIs to address into a media file and retrieve the relevant byte ranges. This is based on communicating it to the server and the server then composing together the relevant byte ranges as a reply. If the file on the server had your keyframe index track, it could probably identify the right byte range faster, so that's a good thing.

I think this is awesome and should be added to the Xiph wiki. Also, it should go into ffmpeg2theora, IMHO, but of course that's up to others to decide.

Cheers,
Silvia.

On Tue, Sep 22, 2009 at 1:27 PM, Chris Pearce <chris at pearce.org.nz> wrote:
> I've developed an indexer which embeds a keyframe index track in Ogg
> files. It embeds the index in its own track, so that players that don't
> understand or don't want to use the index can just ignore it.
>
> [...]
hi,

> I'd appreciate comments on the spec...

> There can be only one index track per Ogg bitstream segment. The index
> packet must occur before all non-metadata streams' content packets. In
> practice this means that the index packet will occur along with other
> secondary header pages, before the skeleton EOS page.

Would this also work for Ogg streams with more than one video track? Not that this is common, but it is possible in theory. In those cases, would an index not need to reference the track it is indexing?

j
Below is another version of the index track spec with one index packet per stream. The index format is still quite simple, though not as compact as the previous "one merged index per file" approach.

I estimate that indexing two tracks, assuming one key point every two seconds in each track, will in practice take approximately 70KB per hour of video (11.6KB per 10 minutes). That's about 20 bytes of index per second of video. With the original "one merged index per file" approach it's about half that, but I think the added size is an acceptable trade-off.

I imagine the majority of video out there on the internet is under 10 minutes long anyway (requiring a 12KB index...), and when playing files over a network, most reasonable quality videos will require about 100KB/s of bandwidth to play back smoothly. If you've got a connection fast enough for streaming video, you won't notice downloading an index. You can tweak the index-keyframe interval to reduce the index size as well, though that erodes the benefit of the index for network playback.

I've implemented this in my indexer on a new branch on my GitHub account:
http://github.com/cpearce/OggIndex/tree/index-per-stream

New spec here:
http://github.com/cpearce/OggIndex/blob/index-per-stream/IndexSpecificationVersion1.txt

Firefox builds which can handle new index format here:
https://build.mozilla.org/tryserver-builds/cpearce at mozilla.com-try-4768e6238638/

Demo here:
http://pearce.org.nz/video/indexed-seek-demo.html

New Proposed Index Track Format:

<quote>
An Ogg index track starts with an identifier header packet which contains the following data, in the following order:
* The identifier "index\0".
* The index version format number, as a 1 byte unsigned integer. This specification describes version 1, so this field should have the value 0x01.
* The playback start time, in milliseconds, as an 8 byte unsigned integer; this is the presentation time of the first frame.
* The playback end time, in milliseconds, as an 8 byte unsigned integer; this is the end time of the last frame.
* The length of the indexed segment, in bytes, as an 8 byte unsigned integer.

The track then contains secondary header packets, which contain the actual indexes. These are the "index packets", and each must begin on a new page, but they may span multiple pages. There is one index packet for each content stream in the Ogg segment, and they appear in increasing order of the streams' serialno.

Each index packet contains the following:
* The serialno of the stream as a 4 byte field.
* The number of key points in the index packet, 'n', as a 4 byte unsigned integer.
* 'n' key points, each of which contains, in the following order:
  - the page's byte offset as an 8 byte unsigned integer, followed by
  - the checksum of the page found at the offset, as a 4 byte field, followed by
  - the presentation time in milliseconds of the key point, as an 8 byte unsigned integer.

The key points are stored in increasing order by offset. The presentation time of the key point is calculated from the granulepos.

[...]

The last packet in the track is an empty EOS packet, which must start on a new page.
</quote>

Note that this format can be encoded in one pass. If you know the duration of the media, you can decide the keyframe interval (say one every 2 seconds, which is roughly ffmpeg2theora's default for Theora anyway), allocate the required space in the index packets, and then come back and fill it in once you've encoded the media.

Comments? Questions etc?

Chris P.
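
As an illustration of the per-stream index packet layout quoted above, here is a rough sketch that packs and parses one index packet and reproduces the size estimate. It assumes little-endian fields, which the quoted text does not state, and treats the serialno and checksum as unsigned 32-bit integers. All function names are hypothetical. The estimate ignores the identifier header, the EOS packet and Ogg page overhead.

```python
import struct

# Assumption: all fields are little-endian; the quoted spec text does not say.
PACKET_HEAD = struct.Struct("<II")  # stream serialno, number of key points 'n'
KEY_POINT = struct.Struct("<QIQ")   # page offset, page checksum, time in ms


def pack_index_packet(serialno, key_points):
    """Serialise (offset, checksum, time_ms) tuples, ordered by offset."""
    ordered = sorted(key_points)  # the spec stores key points by increasing offset
    body = PACKET_HEAD.pack(serialno, len(ordered))
    for offset, checksum, time_ms in ordered:
        body += KEY_POINT.pack(offset, checksum, time_ms)
    return body


def parse_index_packet(data):
    """Return (serialno, [(offset, checksum, time_ms), ...])."""
    if len(data) < PACKET_HEAD.size:
        raise ValueError("index packet too short for its header")
    serialno, n = PACKET_HEAD.unpack_from(data, 0)
    if len(data) < PACKET_HEAD.size + n * KEY_POINT.size:
        raise ValueError("index packet truncated")
    points = [
        KEY_POINT.unpack_from(data, PACKET_HEAD.size + i * KEY_POINT.size)
        for i in range(n)
    ]
    return serialno, points


def estimated_index_bytes(duration_s, streams=2, keyframe_interval_s=2):
    """Rough payload size of all index packets for a file of duration_s seconds."""
    points_per_stream = duration_s // keyframe_interval_s
    return streams * (PACKET_HEAD.size + points_per_stream * KEY_POINT.size)


if __name__ == "__main__":
    packet = pack_index_packet(1234, [(0, 0x11223344, 0), (81920, 0xDEADBEEF, 2000)])
    print(parse_index_packet(packet))
    print(estimated_index_bytes(3600), estimated_index_bytes(600))
```

Each key point is 8 + 4 + 8 = 20 bytes, so two streams with one key point every two seconds cost about 20 bytes per second of video; `estimated_index_bytes(3600)` gives 72016 bytes, in line with the roughly 70KB per hour quoted above.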
Back by popular demand, a new version of OggIndex, which encapsulates the keyframe index in the skeleton track. Available here:
http://github.com/cpearce/OggIndex/tree/skeleton-index-per-stream

I added a few fields to the skeleton ident header, and added "index" packets after the "fisbone" packets. I increased the skeleton version field to 3.1. Below is the skeleton track specification I used.

Skeleton 3.1 Spec - Includes one "index" packet per stream.

Skeleton header packet:
1. Identifier: 8 bytes, "fishead\0".
2. Version major: 2 byte unsigned integer signifying the major version (3)
3. Version minor: 2 byte unsigned integer signifying the minor version (1)
4. Presentationtime numerator: 8 byte signed integer
5. Presentationtime denominator: 8 byte signed integer
6. Basetime numerator: 8 byte signed integer
7. Basetime denominator: 8 byte signed integer
8. UTC [ISO8601]: a 20 byte string containing a UTC time
9. [NEW] Start time, the presentation time in milliseconds of the first sample in the media. 8 byte signed integer, -1 if unknown. Note that samples between the Start time and the Presentationtime are not supposed to be shown.
10. [NEW] End time, the end time of the last sample in the media. 8 byte signed integer, -1 if unknown.
11. [NEW] The length of the segment, in bytes: 8 byte signed integer, -1 if unknown.

Skeleton 'fisbone\0' packets as per skeleton track 3.0, one per stream.

[NEW] Skeleton index packets. There should be one per content stream, coming after the fisbone packets, before the skeleton EOS packet:
1. Identifier, 6 bytes: "index\0"
2. The serialno of the stream as a 4 byte field.
3. The number of key points in the index packet, 'n', as a 4 byte unsigned integer. This can be 0.
4. 'n' key points, each of which contains, in the following order:
   - the page's byte offset as an 8 byte unsigned integer, followed by
   - the checksum of the page found at the offset, as a 4 byte field, followed by
   - the presentation time in milliseconds of the key point, as an 8 byte unsigned integer.

Existing player compatibility with index-in-skeleton and index-in-index-track files:

* Firefox 3.5.3 plays the index-in-skeleton files, purely by luck, whereas it can't get the duration of index-in-index-track files due to a bug in liboggz. Firefox 3.5 uses liboggplay, which assumes any skeleton packet which doesn't have "fishead" magic bytes is a "fisbone" packet. It's possible that a skeleton index packet would break FF3.5 due to being parsed as a fisbone packet. So regardless of which approach we took, we'd have to patch Firefox 3.5.x.
* The Ogg DirectShow codecs work OK with index-in-skeleton files, but seeks in index-in-index-track files result in a seek to 0.
* VLC - plays both index-in-skeleton and index-in-index-track files. Seeks to 0 in index-in-skeleton files make the video window disappear, but the media can still be played again after this happens.
* XiphQT - plays and seeks (inside buffered ranges) with both types of files.
* Totem/gstreamer - with both index-in-skeleton and index-in-index-track files, Totem/gstreamer prompts for a missing plugin on load, but will still play the file. All seeks just reset the playback position to 0. I assume gstreamer must be checking the skeleton version field, and/or prompting when it reads the packets/tracks with unrecognized magic bytes.

Firefox index-in-skeleton capable builds available for download here:
http://build.mozilla.org/tryserver-builds/cpearce at mozilla.com-try-b7b626658548

Indexed videos for testing available here:
http://pearce.org.nz/video/

Chris P.
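
The Skeleton 3.1 header layout listed above can be sketched as a fixed-size record, together with a dispatcher that tells the three skeleton packet types apart by their magic bytes instead of assuming that anything other than "fishead" is a "fisbone". This is only an illustration under stated assumptions: little-endian fields (as in earlier Skeleton versions) and hypothetical names throughout. It is not code from OggIndex, liboggz or liboggplay.

```python
import struct
from typing import NamedTuple, Optional, Tuple

# Fields 1-11 of the header list above; little-endian is an assumption here.
FISHEAD = struct.Struct("<8sHHqqqq20sqqq")

MAGICS = {b"fishead\0": "fishead", b"fisbone\0": "fisbone", b"index\0": "index"}


class SkeletonHeader(NamedTuple):
    version: Tuple[int, int]
    presentation_time: Tuple[int, int]  # numerator, denominator
    base_time: Tuple[int, int]          # numerator, denominator
    utc: bytes
    start_ms: Optional[int]             # None when stored as -1 (unknown)
    end_ms: Optional[int]
    segment_length: Optional[int]


def packet_kind(packet: bytes) -> str:
    """Classify a skeleton packet by its magic bytes."""
    for magic, kind in MAGICS.items():
        if packet.startswith(magic):
            return kind
    return "unknown"


def _known(value: int) -> Optional[int]:
    """Map the -1 'unknown' marker to None."""
    return None if value == -1 else value


def parse_fishead(packet: bytes) -> SkeletonHeader:
    """Parse a header packet that carries the three [NEW] 3.1 fields."""
    if len(packet) < FISHEAD.size:
        raise ValueError("packet too short for a Skeleton 3.1 header")
    (magic, major, minor, pres_n, pres_d, base_n, base_d,
     utc, start, end, length) = FISHEAD.unpack_from(packet, 0)
    if magic != b"fishead\0":
        raise ValueError("not a fishead packet")
    if (major, minor) < (3, 1):
        raise ValueError("header predates the index fields")
    return SkeletonHeader((major, minor), (pres_n, pres_d), (base_n, base_d),
                          utc, _known(start), _known(end), _known(length))
```

A parser written this way skips packets it reports as "unknown" rather than misreading them, which is the failure mode described for liboggplay above.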
On 12/2/2009 11:16 AM, Gregory Maxwell wrote:
> Care to build some diabolically wrong test cases?
> I think it's important to have a solid understanding of how apps fail with
> broken indexes (and how they should behave).

Sure, I'll add it to my ToDo list.

>> We're planning on shipping support for keyframe index assisted seek in
>> Firefox 3.7, which will ship sometime next year.
>
> Can you provide some more details on the timeline here? When do you expect
> the code will be set in stone?

I'm not entirely sure. I've heard Q2 mentioned, but it may be later. We don't like to take new features after we've started doing betas, so as long as it goes in before we start doing 3.7 betas, we can change the code up until release. We've probably got several months until then.

Chris P.