Philip Jägenstedt
2003-Aug-09 15:40 UTC
[theora-dev] First steps towards a simple text stream format.
Hello everyone!

This list may not be the entirely appropriate place for this discussion, but lacking an ogg@xiph.org or ogg-dev@xiph.org, it will have to do. I've been thinking for a few weeks that Ogg needs a simple text stream (read: subtitle) format to go along with Theora. This is important, because otherwise I can't transcode The Fellowship of the Ring while keeping the Elvish speech, unless I render the text onto the video frame, and that's not cool. As you can see, the world will end if there is not a subtitle format for Ogg soon. This is what I've come up with.

Goals:

To create a generic text stream format which is flexible enough to be used for subtitles or lyrics, but doesn't attempt to do more than it should. The idea is that this format is made to be accepted by Xiph. I could pretend that this is just a text stream, but if it were just a text stream, all it would do is deliver a string of text at a given time. This format needs to do at least one thing more, namely specify the duration of that string, or it couldn't be used for subtitles.

This brings up the question of why it shouldn't do text decorations (bold, italics, underline), colors, sizes or even deliver the fonts themselves. If you want to see a subtitle format that does some of this and much more, see USF (Universal Subtitle Format). Certainly one could embrace USF and try to fit it into Ogg, but IMHO it simply does more than a subtitle format should. There can even be images and stuff in USF, and there are plans for implementing rotations/transforms. Now this sounds a bit similar to something else, namely SVG (USF is also XML-based, BTW). I'm thinking that if anyone wants anything more complex than a text stream, they would use the mng-in-ogg stream that someone's working on, or in the future SVG (which natively has a way to deliver fonts, and could do all that USF does and more) could be baked into an Ogg stream.
But that's not what this format is about, just the reason why it won't do stuff that some people will inevitably ask for.

Design features:

* Text streams must be encoded in UTF-8. Most current simple subtitle formats don't specify an encoding at all.
* Text streams must specify a language in the format specified by RFC 3066. This is so that a player application may select the stream which best fits the user's locale, or perhaps load a different font better suited for rendering the given language. RFC 3066 is cool because it's not just limited to "real" languages -- you can specify Klingon subtitles!
* Each text stream must specify a description, to let a user select between several. For subtitles this might be the language, e.g. "English", "English for people who can't hear quite well", "Svenska". In UTF-8.
* A Vorbis comment block in which arbitrary comments can be stored.

Tools:

It's quite clear that the actual implementation of the format outlined here wouldn't be 10 years of work. However, a good toolchain to actually use the format would probably take a lot of time. If anyone's tried extracting a subtitle stream from a DVD using transcode, subtitle2pgm, gocr, ispell and a couple more tools, you know it isn't exactly intuitive. Hence, it wouldn't hurt to have a (graphical) tool which could convert DVD subtitles into this magic format in a more intuitive manner. Anyway, I hope that in the years to come some people will actually create multimedia content with Vorbis+Theora from the start, so that it's not simply used for backing up DVDs. In other words, more generic tools to author subtitles from scratch need to exist. However, all of this is far into the future and not the focus of my immediate concern.

There is however one problem which I don't know what to do about: how to pack the text strings? What the SRT subtitles in OGM do is have a separate page for each subtitle.
This is a simple solution, but it means that the overhead of the subtitles will probably be over 50%, and that ain't cool. The problem is of course that the subtitles can be very far apart in time, so if they are all lumped together into comfortable chunks, they'd only display if you play the file right through without seeking (because if you seek past the point where a subtitle begins, but still within its duration, you're never going to see it).

What possible solutions might there be to this? Having the player application seek through the entire file at the beginning and keep all subs in memory is a solution, but not a good one. Also, even if each subtitle has its own page, you're still not going to see it if you seek into the middle of its duration.

Any comments/insights are welcome.

// Philip Jägenstedt

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'theora-dev-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.
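To make the packing problem in the message above concrete, here is a rough Python sketch. It is not part of the proposal: the `Cue` type, its millisecond fields, and the function names are hypothetical. The overhead figure assumes one subtitle per Ogg page and counts only the page header (27 fixed bytes plus one lacing byte per segment, as in the Ogg framing format) against the UTF-8 text, ignoring whatever bytes a real format would spend on timing.

```python
from dataclasses import dataclass
from typing import List

# Fixed part of an Ogg page header; the segment table adds one byte per lacing value.
OGG_PAGE_HEADER_BYTES = 27


@dataclass(frozen=True)
class Cue:
    """Hypothetical subtitle cue: a string shown from start_ms for duration_ms."""
    start_ms: int
    duration_ms: int
    text: str

    @property
    def end_ms(self) -> int:
        return self.start_ms + self.duration_ms


def page_overhead(cue: Cue) -> float:
    """Fraction of a one-cue page that is header rather than text (assumed layout)."""
    payload = len(cue.text.encode("utf-8"))
    # A packet of n bytes needs n // 255 + 1 lacing values in the segment table.
    segments = payload // 255 + 1
    header = OGG_PAGE_HEADER_BYTES + segments
    return header / (header + payload)


def cues_active_at(cues: List[Cue], t_ms: int) -> List[Cue]:
    """Cues that should be on screen at time t_ms."""
    return [c for c in cues if c.start_ms <= t_ms < c.end_ms]


def cues_seen_after_seek(cues: List[Cue], seek_ms: int) -> List[Cue]:
    """Cues a naive player reads after seeking: only those starting at or after seek_ms."""
    return [c for c in cues if c.start_ms >= seek_ms]


if __name__ == "__main__":
    cues = [Cue(0, 2000, "a"), Cue(5000, 4000, "b"), Cue(20000, 1000, "c")]
    print(page_overhead(Cue(0, 1000, "x" * 28)))
    print(cues_active_at(cues, 6000))        # cue "b" should be visible
    print(cues_seen_after_seek(cues, 6000))  # but the player only reads cue "c"
```

With a 28-byte subtitle on its own page, the header is as large as the text, so shorter strings do worse than 50%. The last two functions model the seeking problem: a cue that started before the seek point but is still within its duration is active, yet a player that only reads forward from the seek point never sees it.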
GODA-XEN@terra.es
2003-Aug-09 16:15 UTC
Re: [theora-dev] First steps towards a simple text stream format.
I like this. I am working on a subtitle format myself (I don't have anything yet, only a draft of the desired specifications). However, I decided not to use UTF-8; instead, I am working on a kind of compressed UTF-8. In other words, the format is similar to UTF-8 in some ways. My idea:

00000000-01111111 -> English characters, the same as in UTF-8
1x...             -> index into a table of UTF characters

This saves space in a subtitle in any language and isn't difficult to parse; in a double-byte language we can save space with it. I don't compress every character, for one small reason: subtitle control. For control we use the English character set, and those characters are always present.

The other part is the use of an embedded font. Only the part actually used is encoded, as PNG or in a vector format, but PNG is easier to parse on low-power machines.
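One possible reading of the table idea above, sketched in Python. The message gives no details beyond the two byte ranges, so everything else here is an assumption: a single table per stream, at most 128 entries, one byte per indexed character, and the hypothetical names `build_table`, `encode` and `decode`.

```python
from typing import List


def build_table(text: str) -> List[str]:
    """Collect distinct non-ASCII characters in order of first use (assumed max 128)."""
    table: List[str] = []
    for ch in text:
        if ord(ch) >= 0x80 and ch not in table:
            table.append(ch)
    if len(table) > 128:
        raise ValueError("more than 128 distinct non-ASCII characters")
    return table


def encode(text: str, table: List[str]) -> bytes:
    """ASCII is stored as-is (0x00-0x7F); other characters as 0x80 | table index."""
    out = bytearray()
    for ch in text:
        code = ord(ch)
        if code < 0x80:
            out.append(code)
        else:
            out.append(0x80 | table.index(ch))
    return bytes(out)


def decode(data: bytes, table: List[str]) -> str:
    """Invert encode(): high bit clear means ASCII, high bit set means table lookup."""
    return "".join(chr(b) if b < 0x80 else table[b & 0x7F] for b in data)


if __name__ == "__main__":
    line = "Jägenstedt på svenska"
    table = build_table(line)
    packed = encode(line, table)
    print(len(packed), len(line.encode("utf-8")))
    print(decode(packed, table) == line)
```

Under these assumptions every character costs one byte, but the table itself has to be stored in the stream as well, so the saving only pays off once the text is long enough to amortise it. ASCII control sequences stay readable without consulting the table, which matches the point about the English character set always being present.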
Silvia.Pfeiffer@csiro.au
2003-Aug-09 16:33 UTC
[theora-dev] First steps towards a simple text stream format.
Hi Philip,

you might want to check out www.annodex.net. Annodex defines an annotation bitstream for Ogg files, where annotations appear time-synchronously to the clips you are annotating (just like captions). Annodex does not do text formatting, but as it uses XML, HTML-like formatting could be incorporated. Annodex has multiple annotation tracks, so you can do several languages in parallel. It also has meta tags for structured comments (so Vorbis comments could be fitted into there). In addition it has a hyperlink concept incorporated where you can attach a URI to a clip.

So, it goes a bit beyond what you are asking, but as it is simple, you might find it useful anyway. It shouldn't be very difficult to implement a dvd2annodex program using Theora - in fact, that's a great idea!

Cheers,
Silvia.