There has been some light talk of HTTP adaptive streaming support for
HTML5, but I wanted to make sure it is making progress and get feedback
from Mozilla video tag developers and Theora community members before
pushing for the HTML5 spec to include the necessary information.
I think we will want to target near-zero server-side configuration. The
server should only have to support normal HTTP byte-range requests
and host multiple bitrate files. Adaptive streaming would then be
expressed as multiple <source> children, each with a given
"bitrate", plus an attribute on the parent <video> tag that says
whether the player should switch bitrates automatically.
Something like:
<video adaptiveSource="true">
  <source src="http://cdn1.cat.com/myVid.256k.ogg"
          bitrate="256k"
          codecs="theora, vorbis"/>
  <source src="http://cdn2.cat.com/myVid.386k.ogg"
          bitrate="386k"
          codecs="theora, vorbis"/>
  <source src="http://cdn1.cat.com/myVidSD.ogg"
          bitrate="512k"
          codecs="theora, vorbis"/>
  <source src="http://cdn3.cat.com/myVidHD_480P.ogg"
          bitrate="1024k"
          codecs="theora, vorbis"/>
  <source src="http://cdn5.cat.com/myVidHD_720p.ogg"
          bitrate="2048k"
          codecs="theora, vorbis"/>
  <source src="http://cdn5.cat.com/myVidHD_1080p.ogg"
          bitrate="4096k"
          codecs="theora, vorbis"/>
</video>
All the bitrate-switching logic would be implemented in the browser.
The video element would also have to support some events, such as
"onswitchsource", with the source element passed to the callback, so
that user interfaces could be updated when the stream switches sources.
~ not necessary for the whatwg ~
Implementation-wise, it could work by dynamically switching streams
based on how fast the next keyframe byte offset chunk was downloaded. A
byte offset chunk would be defined as the byte range between two
keyframes. This would depend heavily on Ogg index support, so that
frame-accurate seeks happen seamlessly when switching chunks.
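
To make the "byte offset chunk" idea concrete, here is a rough sketch in Python (chosen only for readability; a browser would do this natively). It assumes the index can be read as a list of (keyframe time, byte offset) pairs. The names Keyframe, Chunk, chunks_from_index and range_header are made up for illustration and are not part of any Ogg index or browser API.

```python
from typing import List, NamedTuple, Optional


class Keyframe(NamedTuple):
    """One hypothetical index entry: where a keyframe starts in the file."""
    time_s: float      # presentation time of the keyframe, in seconds
    byte_offset: int   # byte position of the keyframe in the file


class Chunk(NamedTuple):
    """The byte range between one keyframe and the next."""
    start_time_s: float
    first_byte: int
    last_byte: Optional[int]  # None means "through the end of the file"


def chunks_from_index(keyframes: List[Keyframe]) -> List[Chunk]:
    """Turn a keyframe index into a list of byte offset chunks."""
    ordered = sorted(keyframes, key=lambda k: k.byte_offset)
    chunks = []
    for i, kf in enumerate(ordered):
        if i + 1 < len(ordered):
            # HTTP byte ranges are inclusive, so stop one byte before
            # the next keyframe.
            last = ordered[i + 1].byte_offset - 1
        else:
            last = None
        chunks.append(Chunk(kf.time_s, kf.byte_offset, last))
    return chunks


def range_header(chunk: Chunk) -> str:
    """Build the value of an HTTP Range request header for one chunk."""
    last = "" if chunk.last_byte is None else str(chunk.last_byte)
    return f"bytes={chunk.first_byte}-{last}"


index = [Keyframe(0.0, 0), Keyframe(2.0, 50000), Keyframe(4.0, 120000)]
chunks = chunks_from_index(index)
print(range_header(chunks[0]))  # bytes=0-49999
print(range_header(chunks[2]))  # bytes=120000-
```

Because each chunk maps to an ordinary Range request, the server needs nothing beyond normal byte-range support.
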
Ideally content providers would encode each stream with the same
keyframe settings, resulting in similar time offsets for keyframe
switching. But that would not be required, and the system would
obviously have to seek to the right time, not just some random keyframe ;)
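
A sketch of "seek to the right time" when the streams do not share keyframe positions, again in illustrative Python. It assumes the target stream's index is available as (time, byte offset) pairs sorted by time; keyframe_for_time is a hypothetical helper name. The idea is to pick the last keyframe at or before the switch time, so the decoder can start there and drop frames until it reaches the exact switch point.

```python
import bisect
from typing import List, Tuple


def keyframe_for_time(index: List[Tuple[float, int]],
                      target_s: float) -> Tuple[float, int]:
    """Return the (time, byte_offset) of the last keyframe at or before
    target_s. `index` must be sorted by time. If target_s is earlier than
    the first keyframe, the first keyframe is returned."""
    if not index:
        raise ValueError("keyframe index is empty")
    times = [t for t, _ in index]
    # bisect_right gives the insertion point after any equal entries,
    # so subtracting one lands on the keyframe at or before target_s.
    pos = bisect.bisect_right(times, target_s) - 1
    return index[max(pos, 0)]


# Hypothetical index for the stream being switched to.
target_index = [(0.0, 0), (2.5, 60000), (5.0, 130000)]
print(keyframe_for_time(target_index, 4.0))  # (2.5, 60000)
```

With matching keyframe settings across streams the lookup lands exactly on the switch time; otherwise some already-played video between the keyframe and the switch time has to be downloaded and decoded but not shown.
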
When the browser first starts buffering the clip, it would target a
default bitrate based on a previously stored client download rate. It
would then measure how fast it downloaded a given byte offset chunk and
switch up or down. If the download rate supported the bitrate of a
higher quality stream, it would download that stream's oggIndex and
start fetching its byte offset chunks a few chunks ahead of the current
position. If a chunk downloaded more slowly than needed to maintain
playback, it would switch downward. Optimizations could include caching
the byte offsets of the stream below and the stream above on initial
load.
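
The up/down decision could look roughly like this (illustrative Python, not a proposed algorithm). Two things here are assumptions not stated above: that "256k" means 256,000 bits per second, and the 0.8 headroom factor, which just keeps the player from picking a stream that uses the entire measured rate. All function names are hypothetical.

```python
from typing import List


def parse_bitrate(value: str) -> int:
    """Parse a bitrate attribute such as "256k" into bits per second.
    Assumes k = 1000 and that the unit is bits per second."""
    value = value.strip().lower()
    if value.endswith("k"):
        return int(value[:-1]) * 1000
    return int(value)


def measured_rate_bps(chunk_bytes: int, seconds: float) -> float:
    """Download rate observed for one byte offset chunk, in bits/second."""
    if seconds <= 0:
        raise ValueError("download time must be positive")
    return chunk_bytes * 8 / seconds


def pick_bitrate(available: List[int], rate_bps: float,
                 headroom: float = 0.8) -> int:
    """Pick the highest bitrate that fits within a fraction of the
    measured rate, falling back to the lowest stream if none fits."""
    usable = [b for b in sorted(available) if b <= rate_bps * headroom]
    return usable[-1] if usable else min(available)


ladder = [parse_bitrate(b) for b in
          ["256k", "386k", "512k", "1024k", "2048k", "4096k"]]

# A 300,000-byte chunk that took 2 seconds is 1.2 Mbit/s measured.
rate = measured_rate_bps(300000, 2.0)
print(pick_bitrate(ladder, rate))  # 512000
```

A real player would probably smooth the measurement over several chunks before switching, so that one slow chunk does not cause the stream to flip back and forth.
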
--michael