Robin Siegemund
2007-Feb-15 04:18 UTC
[theora-dev] How to do Theora playback efficiently ?
Dear Theora developer community,

I'm currently working on a simple Theora player for Windows, but the code in player_example.c does not seem to match the performance of other implementations such as the Direct Show filters by illuminate.

In the example player, all the important work is done in one thread: decoding the next Vorbis or Theora packet(s) and, if more data is needed, reading from the physical stream (plus splitting it into the Vorbis and Theora logical streams). This works for Theora streams with low dynamic data rates, but not for my test streams, where from time to time a single keyframe needs 20-30 Ogg pages of data.

Normally you can handle this by prebuffering some already decoded audio sample buffers or video frames, but even with buffers of several megabytes this is in practice no solution, because the audio decoder has to wait until Theora has finished reading/preparing/decoding its pages. And before you can see whether a Theora frame has to be skipped (too late for decoding), you must do several pagein calls and several other things first, which also takes a lot of time. Typically, gapless audio playback has higher priority than smooth video playback, so another implementation is needed.

Because of this I tried to do the demuxing (reading from the file, doing the page_in calls) in a separate thread. But without defining critical sections or otherwise synchronising the threads, I get a lot of runtime errors (collisions between ogg_stream and ogg_theora functions because of suddenly zero packets, etc.). If I synchronise the threads it works, but I get a big slowdown again, so it is not efficient.

I don't understand why I have all these Theora and Ogg states when they are not capable of handling multiple threads. The only thing I do is:

1. Thread: Read data and put pages into the right stream (Vorbis or Theora)
2. Thread: Get packets from a stream and decode its data

So apart from the Ogg/Theora states, both threads are working on different variables/buffers, BUT they collide. How useful is a state when I can't access it from multiple threads?

The main problem is that Theora does a lot of things BEHIND or HIDDEN with all those states and packets, and it all seems so easy for the programmer. But it isn't, so sometimes I wish I could do all these things on my own.

But back to the problem. What I need is one page of audio data ready for preparing/decoding at any time, independent of what the Theora decoder is doing. But how do I get this Vorbis page when a lot of Theora pages are in between? These Theora pages have to be read first (otherwise they get lost), but the audio decoder should not have to wait for this. And the worst case: the Theora packet preparing/decoding is still working on a frame, so no more page_in calls for the Theora stream are possible (or you get those collisions).

I think an ideal solution would be two "pointers" in separate threads, each reading the physical source stream independently and grabbing its own pages (and preparing/decoding them) while ignoring the others, without needing to wait. So I need two ogg_sync_states or something like that?

It must be possible to work without sharing all these Ogg states, because in the case of Direct Show the filters (Ogg demuxer, Vorbis decoder, Theora decoder) are completely independent of proprietary Ogg states (they are not visible to others outside a COM object).

It would be great if someone had some hints for me on how to make a good player. OK, a good piece of example code would be great, too. ;-)

Regards,
Robin Siegemund, dp Munich
On Thu, Feb 15, 2007 at 11:18:11AM +0100, Robin Siegemund wrote:

> currently I'm working on a simple Theora player for Windows. But the code in
> the player_example.c seems not to have the performance of other
> implementations like the Direct Show filters by illuminate. In the example
> player, all important things are done in one thread: decoding the next
> vorbis or theora packet(s) and reading from the physical stream (+ split the
> streams to a vorbis or theora logical stream) if more data is needed.
> This works for theora streams with low dynamic data rates but not for my
> test streams, where from time to time a single keyframe needs 20-30 OggPages
> of data.

Well, it works as long as you can decode both streams fast enough. But it fails suddenly as soon as the theora decoder takes long enough to starve the playback buffer. There's provision for dropping decoded frames (which doesn't help much if you have hardware YUV2RGB and scaling) but nothing for dropping compressed packets (about all you can do is stop decoding until the next keyframe anyway).

> I don't understand why I have all these theora and ogg states when they are
> not capable of handle multiple threads. The only thing I do is:
>
> 1. Thread: Read data and putting pages in the right stream (vorbis or
> theora)
> 2. Thread: Get Packets from a stream and decode its data

You still need locking between the ogg_stream_pagein() and ogg_stream_packetout() calls on the same ogg_stream structure. This should be light compared to the stream reading and decoding, and not slow things down.

> The main problem is that Theora does a lot of things BEHIND or HIDDEN with
> all those states and packets and all seems so easy for the programmer. But
> it isn't, so sometimes I wish to do all the things on my own.

I'm not sure what you mean here. libtheora assumes the buffer you give it with packetin() will remain valid until you call YUVout().
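
To illustrate the locking point above, here is a minimal sketch of a lock that is held only around the hand-over between the demux thread and a decoder thread, so that it stays light compared to reading and decoding. It does not call libogg: the ring buffer is only a stand-in for one shared ogg_stream_state, and packet_ref, packet_fifo, fifo_push and fifo_pop are hypothetical names introduced for this example. It assumes POSIX threads; on Windows a CRITICAL_SECTION would presumably play the same role.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define FIFO_CAPACITY 64 /* hypothetical size, not a value from the thread */

/* Hypothetical stand-in for a compressed packet (not libogg's ogg_packet). */
typedef struct {
    const unsigned char *data;
    size_t len;
} packet_ref;

/* Ring buffer shared by the demux thread (producer) and one decoder thread
 * (consumer). The mutex plays the role of the lock around the pagein and
 * packetout calls on a single shared stream state. */
typedef struct {
    packet_ref slots[FIFO_CAPACITY];
    size_t head;  /* index of the oldest queued packet */
    size_t count; /* number of queued packets */
    pthread_mutex_t lock;
} packet_fifo;

void fifo_init(packet_fifo *q) {
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
}

/* Producer side: returns 1 on success, 0 if the queue is full. */
int fifo_push(packet_fifo *q, packet_ref p) {
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count < FIFO_CAPACITY) {
        q->slots[(q->head + q->count) % FIFO_CAPACITY] = p;
        q->count++;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Consumer side: returns 1 and fills *out, or 0 if the queue is empty. */
int fifo_pop(packet_fifo *q, packet_ref *out) {
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        *out = q->slots[q->head];
        q->head = (q->head + 1) % FIFO_CAPACITY;
        q->count--;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}
```

In this sketch the decoder thread would call fifo_pop(), release the lock on return, and only then decode, so file reads and decoding never happen while the mutex is held.
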

If you need more lookahead, you can build FIFOs of compressed packets so neither decoder can underrun, or FIFOs of uncompressed data. I believe this is what media frameworks like DirectShow and GStreamer do. Or, as you suggest, you can maintain multiple read pointers. Note that you can use the decoupling between ogg_stream_pagein() and ogg_stream_packetout() as a FIFO.

> I think, an ideal solution are two "pointers" in separate threads reading
> independently the physical source stream and grab their pages (and
> prepare/decode them) but ignore the other ones and don't need to wait. So I
> need two ogg_sync_states or something like that?

Yes, you'd need two ogg_sync_states that way, and you'd just drop any pages you weren't interested in inside a given thread.

http://svn.annodex.net/liboggplay/trunk/ is another example you might want to look at. I don't think it runs on Windows yet, though.

Hope that helps,
-r
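
The idea of separate read pointers that each drop the pages they are not interested in can be sketched without any shared state. The example below is not libogg code: it walks a raw byte buffer using the Ogg page header layout (capture pattern "OggS", serial number at byte offset 14, segment count at offset 26) and returns only the pages of one logical stream. raw_page_length, page_serial and next_own_page are hypothetical helper names, and the sketch deliberately skips CRC checking and resynchronisation. In a real player each thread would presumably use its own ogg_sync_state with ogg_sync_pageout() and compare ogg_page_serialno() instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OGG_HEADER_FIXED 27 /* fixed part of an Ogg page header, in bytes */

/* Length of the complete page starting at buf, or 0 if buf does not start
 * with a complete page. No CRC check is done in this sketch. */
size_t raw_page_length(const unsigned char *buf, size_t avail) {
    size_t nsegs, total, i;
    if (avail < OGG_HEADER_FIXED || memcmp(buf, "OggS", 4) != 0) return 0;
    nsegs = buf[26];
    if (avail < OGG_HEADER_FIXED + nsegs) return 0;
    total = OGG_HEADER_FIXED + nsegs;
    for (i = 0; i < nsegs; i++) total += buf[OGG_HEADER_FIXED + i];
    return total <= avail ? total : 0;
}

/* Stream serial number: 32 bits, little-endian, at byte offset 14. */
uint32_t page_serial(const unsigned char *page) {
    return (uint32_t)page[14] | ((uint32_t)page[15] << 8) |
           ((uint32_t)page[16] << 16) | ((uint32_t)page[17] << 24);
}

/* One thread's private read pointer is *pos. Returns the next complete page
 * whose serial number equals `wanted` and advances *pos past it; pages of
 * other streams are skipped. Returns NULL when no complete page is left,
 * leaving *pos at the first unconsumed byte. */
const unsigned char *next_own_page(const unsigned char *buf, size_t len,
                                   uint32_t wanted, size_t *pos,
                                   size_t *page_len) {
    assert(*pos <= len);
    for (;;) {
        const unsigned char *page = buf + *pos;
        size_t n = raw_page_length(page, len - *pos);
        if (n == 0) return NULL;
        *pos += n;
        if (page_serial(page) == wanted) {
            *page_len = n;
            return page;
        }
    }
}
```

Each decoder thread would keep its own position variable and its own file handle or mapping, so neither has to wait while the other is busy; the cost is that the file is read and scanned twice.
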