ogg.k.ogg.k@googlemail.com
2008-Feb-07 05:29 UTC
[ogg-dev] Ogg/Kate preliminary documentation
Hi,

I recognize the main name behind CMML here :)

Does the redesigning of CMML allow overlapping clips ?
This is the main reason of my current ramblings about seeking.

While karaoke was one of the initial goals behind kate, it is just
a way the format can be used with (in fact, the format itself does
not refer to karaoke at all, but styles and motions).
At the moment, it is a fairly versatile text based presentation/video
composition that can be superimposed onto a video (or not).
This includes overlapping events, and animations, which I think
CMML doesn't do (and which I didn't see it do, due to it being
metadata about other data, and thus not "changing" in itself). But
it seems to be changing goals, based on docs Conrad sent me
links to.

About requirements from Ogg itself, I don't think there is anything
really - I've just listened to Ralph's talk about seeking in Ogg, and
what I'm getting from it is "well, though, a multiple seek is required",
so that's just the way it is. Otherwise, everything sits just fine within
Ogg.

You're quite right about CSS too. It's awfully complicated (but then
I originally thought the same about CMML including HTML bits),
and I made a simpler (and of course much less versatile) style setup.
I wonder if this is a mistake.

I didn't know you could time CSS changes however, unless you just
refer to replacing the text by another text with the same actual visible
string but css markers placed differently ?

Anyway, seeking is the only bit that's really left now, it all works fine
(from the encoder/decoder lib point of view anyway), and I didn't hit
any showstoppers from ogg, hence no extra requirements. What did
you have in mind when you referred to those issues that came up when
you designed CMML ?

As a bit of background, so you understand what I'm talking about, one
of the things I can do with kate is to have (eg):
- a (TV station ?) logo in a corner
- a scrolling transcript of what is being said
- a digital time in another corner
- a channel number in a third
- and whatever else you'd need
all at the same time on screen, optionally moving around.

It's not really showable now because you need a patched version of xine
to actually view a video with the kate bitstream rendered on top of it,
but you get the idea.
Unless the changes to CMML are geared towards allowing this kind of
things too ?

Thanks
On Feb 8, 2008 12:28 AM, ogg.k.ogg.k@googlemail.com <ogg.k.ogg.k@googlemail.com> wrote:

> As a bit of background, so you understand what I'm talking about, one
> of the things I can do with kate is to have (eg):
> - a (TV station ?) logo in a corner
> - a scrolling transcript of what is being said
> - a digital time in another corner
> - a channel number in a third
> - and whatever else you'd need
> all at the same time on screen, optionally moving around.
>
> It's not really showable now because you need a patched version of xine
> to actually view a video with the kate bitstream rendered on top of it,
> but you get the idea.
> Unless the changes to CMML are geared towards allowing this kind of
> things too ?

Some of the things you talk about were not solved at the CMML level, but
rather through using different Ogg logical bitstreams. For example:

* the logo in a corner would be a logical bitstream of a timed image track -
  such as Ogg MNG
* the transcript would be in a CMML logical bitstream
* the digital time could be either part of that CMML logical bitstream or a
  separate CMML logical bitstream; if it was part of the same CMML track, it
  would be positioned through CSS
* same for channel number
* overlapping timed text pieces would be coming in through different logical
  bitstreams or the CSS (there may be a timing extension necessary to CSS to
  do so - if you have found a better way of doing this, I'll be keen to see)

The advantage of having things in different logical bitstreams is that you
can create addressing schemes that can refer to just a subset of logical
bitstreams if you e.g. only want some part of the composition delivered to
you from a server. For example,
http://example.org/video.ogx?track=video,audio,transcript will avoid giving
you the digital time, logo, and channel number tracks for the above example.
The CMML design has always focused on trying to keep things in components
that can easily be added or taken away.
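
As a rough illustration of the track addressing idea above, here is a minimal
Python sketch of how a server might pick a subset of logical bitstreams from
the "track" query parameter. The function names, the track labels "time",
"logo" and "channel", and the "no parameter means deliver everything" rule
are assumptions made for this example only, not part of CMML or Annodex.

```python
from urllib.parse import urlparse, parse_qs


def requested_tracks(url):
    """Return the track names listed in the URL's `track` query parameter."""
    query = parse_qs(urlparse(url).query)
    names = []
    for value in query.get("track", []):
        # A single parameter may carry several comma-separated names.
        names.extend(n.strip() for n in value.split(",") if n.strip())
    return names


def select_bitstreams(available, url):
    """Keep only the logical bitstreams named in the request.

    Assumption: a request without a `track` parameter gets every bitstream.
    """
    wanted = requested_tracks(url)
    if not wanted:
        return list(available)
    return [name for name in available if name in wanted]


# Hypothetical labels for the logical bitstreams of the example composition.
tracks = ["video", "audio", "transcript", "time", "logo", "channel"]
url = "http://example.org/video.ogx?track=video,audio,transcript"
selected = select_bitstreams(tracks, url)
print(selected)
```

With the example URL, only the video, audio and transcript bitstreams are
kept; the time, logo and channel tracks are left out of the delivery.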

I'm very keen on seeing your specifications and seeing kate at work - it may
well be that you have found some better solutions to some of the problems
that we attack differently with CMML and thus we should think about picking
the best designs. Really wanting to see it working - post your specs and the
patched vlc version here if you can!

BTW: on the kate wiki page, Annodex is mentioned - what Annodex is is simply
an Ogg file with skeleton and a CMML track in addition to other digital
media. It's a term that we used to specify the particular multiplexed file
with which we wanted to work, but it hasn't really much meaning in itself
nowadays.

Cheers,
Silvia.
ogg.k.ogg.k@googlemail.com
2008-Feb-08 03:12 UTC
[ogg-dev] Ogg/Kate preliminary documentation
> Some of the things you talk about were not solved at the CMML level, but
> rather through using different Ogg logical bitstreams.

While it is possible to do it this way (and probably a good idea for
examples like a clock in a corner), it implies that all the placements and
logically different "items" are known at the start of the stream (since the
Ogg spec says a stream can't start midway through another stream, an
interesting restriction, but one which is there nonetheless). While this is
fine for a file based stream, it is not if the stream is generated in
realtime.

While it is not used at the moment, I do have a "category" field in the ID
header, meant to be a tag used by a player to know what is supplied by a
particular stream (eg, the user may want to select a number of categories,
such as "transcript" and "commentary", and a language, and two streams would
be displayed by the player). However, forcing the use of several separate
streams, while having the advantage of keeping things simple (and being the
solution I selected for multiple languages), may be overly restrictive.

> * overlapping timed text pieces would be coming in through different
> logical bitstreams or the CSS (there may be a timing extension necessary
> to CSS to do so - if you have found a better way of doing this, I'll be
> keen to see)

Not a better solution, I'm afraid, merely a different one. You define
regions and (very simple) styles, and there is a system of "motions" (mostly
splines) that can alter attributes like color, position, etc. It's another
custom scheme I'm afraid, but one which is kept simple and powerful I
believe (hope ?).

> The advantage of having things in different logical bitstreams is that you
> can create addressing schemes that can refer to just a subset of logical
> bitstreams if you e.g. only want some part of the composition delivered to
> you from a server. For example,
> http://example.org/video.ogx?track=video,audio,transcript will avoid giving
> you the digital time, logo, and channel number tracks for the above example.
> The CMML design has always focused on trying to keep things in components
> that can easily be added or taken away.

This is a very good point, and the real point of Annodex, if I'm not
mistaken (addressability of audio/video content) ? Kate does not attempt to
deal with this, it's totally outside its scope. I understand that CMML does
this for non CMML streams anyway (eg, Theora) ?

> I'm very keen on seeing your specifications and seeing kate at work - it may
> well be that you have found some better solutions to some of the problems
> that we attack differently with CMML and thus we should think about picking
> the best designs. Really wanting to see it working - post your specs and the
> patched vlc version here if you can!

I'll send you a recent snapshot, feel free to take inspiration from it, but
I've only worked on it for about a month now, so don't expect to see much
you haven't solved yet :)
I do not have a patch for vlc, only MPlayer and xine (MPlayer does only text
subtitles, but xine does all).
As for specs, since the bitstream format is still in flux (and the API to a
lesser extent), there are no docs yet. The wiki page is all there is for the
moment.

> BTW: on the kate wiki page, Annodex is mentioned - what Annodex is is simply
> an Ogg file with skeleton and a CMML track in addition to other digital
> media. It's a term that we used to specify the particular multiplexed file
> with which we wanted to work, but it hasn't really much meaning in itself
> nowadays.

Yes, I've noticed that much of the code (in xine, say) was shared to decode
Ogg and Annodex streams.
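
To make the "category" idea mentioned earlier in this message a bit more
concrete, here is a minimal Python sketch of how a player might choose which
streams to display from a set of user-selected categories and a language.
The StreamInfo type, its field names and the sample values are hypothetical
and only illustrate the selection logic; this is not the kate API or its ID
header layout.

```python
from dataclasses import dataclass


@dataclass
class StreamInfo:
    """Hypothetical per-stream tags a player could read from an ID header."""
    serial: int
    category: str
    language: str


def streams_to_display(streams, categories, language):
    """Keep streams whose category was selected and whose language matches."""
    return [
        s for s in streams
        if s.category in categories and s.language == language
    ]


# Made-up streams: two transcripts, one commentary, one logo.
streams = [
    StreamInfo(1, "transcript", "en"),
    StreamInfo(2, "transcript", "fr"),
    StreamInfo(3, "commentary", "en"),
    StreamInfo(4, "logo", "en"),
]

chosen = streams_to_display(streams, {"transcript", "commentary"}, "en")
print([s.serial for s in chosen])
```

Selecting "transcript" and "commentary" in English leaves two streams to
display, which matches the situation described above.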
On Feb 8, 2008 10:01 PM, ogg.k.ogg.k@googlemail.com <ogg.k.ogg.k@googlemail.com> wrote:

> > can create addressing schemes that can refer to just a subset of logical
> > bitstreams if you e.g. only want some part of the composition delivered
> > to you from a server. For example,
> > http://example.org/video.ogx?track=video,audio,transcript will avoid
> > giving you the digital time, logo, and channel number tracks for the
> > above example.
> > The CMML design has always focused on trying to keep things in
> > components that can easily be added or taken away.
>
> This is a very good point, and the real point of Annodex, if I'm not
> mistaken (addressability of audio/video content) ? Kate does not attempt
> to deal with this, it's totally outside its scope. I understand that CMML
> does this for non CMML streams anyway (eg, Theora) ?

CMML provides addressability only on the annotation track for any file that
it is used in - so, yes, also in a Theora file, but not without having a
CMML bitstream.

Cheers,
Silvia.