thr3ads.net - theora dev - [theora-dev] Extension to Skeleton for multi-track media [Mar 2010]

Silvia Pfeiffer

2010-Mar-23 05:43 UTC

[theora-dev] Extension to Skeleton for multi-track media

Hi all,

Discussions about the need for an extension to Skeleton to cater for
multi-track media files have been going on for a while. In a recent
thread here, in discussions on IRC, and at FOMS between Jan, Ralph,
Viktor and me, we discussed some fields. Viktor and I continued that
discussion to make more specific recommendations on which fields to
add.

We now have a wiki page at http://wiki.xiph.org/SkeletonHeaders that
has aggregated all the different things that were discussed.

"Language", "Role" and "Name" are fields that we want to introduce to
better expose "semantic" information about the tracks. These will help
a media player make better choices about which tracks to display and
which to offer to the user as a selection. They will also be
necessary for providing the (currently proposed) HTML5 JavaScript API
(http://www.w3.org/WAI/PF/HTML/wiki/Media_MultitrackAPI) with the
right information.

Note that right now there is no proposal for explicitly specifying
track dependencies, since the need for these is not really clear.

A further part of the wiki page is the proposal to impose an implicit
order on the tracks through the order in which their BOS pages are
given. This is nothing semantic, but only a convenience so we can
be certain that different Web browsers will address the same track by
the same index number through JavaScript.

Finally, there are two rendering-related fields that we propose
introducing: Display-hint and Altitude (their names could of course
still be changed). Each of these is specified as a feature of the
given track, but relates to the other tracks for rendering. Altitude
specifies the display ordering (like z-index in HTML/CSS), and
Display-hint right now has proposals for picture-in-picture display
relative to the video's display area, for transparency to be applied to
all pixels of the track, for one color to be chosen as completely
transparent (as in green-screening), and for an image mask to be
applied. The image in the image mask would need to be either
referenced or encoded into another skeleton package - this isn't quite
solved yet.

We'd like to introduce these fields into a new version of Skeleton,
such that future formats will be able to include these fields. All of
these fields are optional, so there are no requirements really, just
additional functionality that a media player can make use of.

This email is to ask for input on the different proposals and for
suggestions of further improvements.

Cheers,
Silvia.

Benjamin M. Schwartz

2010-Mar-23 14:19 UTC

[theora-dev] Extension to Skeleton for multi-track media

Silvia Pfeiffer wrote:
> "Language", "Role" and "Name" are fields that we want to introduce to
> better expose "semantic" information about the tracks.
These three are great.  Comments:
1. It is common for movies to list a series of languages, and it's not
always the case that one is dominant.  To accommodate this, we should
permit specifying the Language field multiple times, as allowed in RFC
2822.  The JavaScript API should return an array of language codes.
Conventionally, the first language code should be the dominant one if
present.  A track with no language code should return an empty array.

2. Some of the roles are unclear.  It would be good to add clarifying
descriptions of their meaning and intended use.  For example, I don't know
the motivation or use for: text/activeregion, text/annotation,
text/transcript, text/linguistic, text/chapters, audio/music,
audio/speech, audio/sfx.  Also, video/alpha needs to specify how a
multichannel track (like Theora) can be rendered down to a single alpha
channel, for example by using the unmodified bytes of Y as alpha.

3. It seems that the name is meant to be only a semi-human-readable tag,
not a fully user-facing title.  Perhaps a localized Title field would be a
good addition at some point.
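
Comment 1 above can be sketched in a few lines. This is only an illustration, assuming the message header fields arrive as RFC 2822-style "Name: value" text; the sample header text and the track_languages helper are hypothetical and are not part of the draft.

```python
from email.parser import HeaderParser
from typing import List


def track_languages(message_headers: str) -> List[str]:
    """Return every Language value in file order, or [] if none is present."""
    parsed = HeaderParser().parsestr(message_headers)
    # get_all() keeps repeated fields in the order they appear.
    return [value.strip() for value in parsed.get_all("Language", [])]


# Hypothetical header text for two tracks (names and values are made up).
bilingual = "Role: audio/speech\nLanguage: en\nLanguage: fr\nName: dialogue\n"
untagged = "Role: audio/music\nName: score\n"

print(track_languages(bilingual))
print(track_languages(untagged))
```

By the convention suggested above, the first element of the returned list would be treated as the dominant language when one exists.
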
> A further part of the wiki page is the proposal to impose an implicit
> order on the tracks through the order in which their BOS pages are
> given. This is nothing semantic, but only a convenience so we can
> ascertain that different Web browsers will address the same track by
> the same index number through JavaScript.
I reiterate my preference for associative arrays, indexed by the Ogg track
ID and name.  The BOS ordering is unstable, and provides no benefit that I
can see over unique stream identifiers.
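
A rough sketch of the difference, assuming a remux keeps each stream's serial number and Name but writes the BOS pages in a different order. The Track class, the serial numbers and the names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    serial: int  # Ogg stream serial number
    name: str    # value of the proposed Name field
    role: str    # value of the proposed Role field


def index_tracks(tracks):
    """Build associative lookups keyed by serial number and by name."""
    by_serial = {track.serial: track for track in tracks}
    by_name = {track.name: track for track in tracks}
    return by_serial, by_name


# Hypothetical file with two tracks, listed in BOS order.
original = [
    Track(0x1A2B, "main_video", "video/main"),
    Track(0x3C4D, "dialogue", "audio/speech"),
]
# Assumed remux: same streams, BOS pages written in the opposite order.
remuxed = list(reversed(original))

serial_a, name_a = index_tracks(original)
serial_b, name_b = index_tracks(remuxed)

print(original[0] == remuxed[0])                      # positional lookup
print(name_a["dialogue"] == name_b["dialogue"])       # lookup by name
print(serial_a[0x1A2B] == serial_b[0x1A2B])           # lookup by serial
```

Under that assumption, index 0 names a different track in each file, while the name and serial lookups return the same track in both.
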
> Finally there are two rendering related fields that we propose
> introducing: Display-hint and Altitude (their names could of course
> still be changed).
Altitude seems fine.  I have more problems with Display-hint:

1. pip:
Specifying that a track can be shown as PIP might be a good thing.  This
mechanism seems very rigid, though.  Television sets that provide PIP
usually let the user control the positioning, because they may want to see
different parts of the underlying frame.  I'm not convinced that
specifying a position or size along with the PIP hint is necessary at all.
If it is, the text should say "may be displayed" instead of "should be
displayed" to indicate that the player should give the user control.
Content producers who want exact control of overlay positioning should use
Altitude and video/alpha.

Where are the zero coordinates of the display area?
If w and h are percentages, what are they percentages of?

2. mask:
Ogg files are self-contained.  This proposal breaks that in a huge way,
and I think it's terrible.  The right way to do this is in CSS in the
webpage, a la
http://labs.silverorange.com/files/video-demo/ambient.xhtml
http://webkit.org/blog/181/css-masks/

Please remove mask from the draft.

3. transparentcolor:
This will not work.  Lossy video codecs do not reproduce exact colors.  I
am not aware of any continuous-tone image or video coding system that
employs this approach, because it doesn't work.  Please remove it from the
draft.  People who want transparency will have to use the video/alpha system.
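
The objection can be illustrated with a toy example. The lossy() function below is only a crude stand-in for codec loss (real codecs such as Theora work on transformed YCbCr data, not per-channel rounding), and the key colour and step size are arbitrary assumptions.

```python
KEY = (0, 255, 0)  # hypothetical "transparent" colour chosen by the producer


def lossy(pixel, step=16):
    """Crude stand-in for codec loss: snap each channel down to a multiple of step."""
    return tuple((channel // step) * step for channel in pixel)


def is_transparent(pixel, key=KEY):
    """Exact-match colour keying, as a transparentcolor hint would imply."""
    return pixel == key


source_pixel = KEY
decoded_pixel = lossy(source_pixel)

print(is_transparent(source_pixel))
print(decoded_pixel, is_transparent(decoded_pixel))
```

Once the decoded value drifts from the key, an exact match no longer fires, and a tolerance-based match would in turn risk keying out pixels that were meant to stay visible.
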

Further improvements:
As currently stated, the video/alpha label cannot actually be used to
blend multiple tracks together.  For example, if I want an exactly
controlled optional overlay, I would create 3 Theora tracks labeled as
video/main, video/alpha, and video/alternate (or maybe video/additional),
all the same size.  The altitude of the additional track would be higher
than the main, to indicate that it goes on top.  There are now at least
three possibilities:
1. The alpha track applies to the additional track.
2. The alpha track applies to the main track (before compositing)
3. The alpha track applies to the whole video (after compositing)

At present, there is no way to distinguish these cases, and the situation
is even more underspecified in the case of multiple additional tracks.  To
remedy this, I recommend an additional header field "Applies-to: [name]".
This indicates the name of the track to which a track applies.  For
example, a text track may apply to the audio track of which it is a
transcription, and to the video onto which it should be overlaid.  A
video/sign track Applies-to the audio track of which it is a translation.
A video/alpha track Applies-to each track it is supposed to mask (before
compositing).

For video/alpha, this is still insufficient, because masking a video and
an overlay before compositing them is not the same as masking after
compositing.  To permit masking after compositing, video/alpha tracks
should optionally have one or more Altitudes.  For each Altitude held by a
video/alpha track, it applies to the composited result of all visible
higher tracks.
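
To make the three possibilities concrete, here is a single-pixel sketch. It assumes greyscale (value, opacity) pixels and the standard "over" compositing operator; the pixel values and the 0.5 alpha sample are arbitrary, and nothing here is taken from the draft.

```python
def over(top, bottom):
    """Composite top over bottom; pixels are (value, opacity), non-premultiplied."""
    (top_value, top_alpha), (bottom_value, bottom_alpha) = top, bottom
    out_alpha = top_alpha + bottom_alpha * (1.0 - top_alpha)
    if out_alpha == 0.0:
        return (0.0, 0.0)
    out_value = (top_value * top_alpha
                 + bottom_value * bottom_alpha * (1.0 - top_alpha)) / out_alpha
    return (out_value, out_alpha)


def mask(pixel, alpha):
    """Scale a pixel's opacity by one sample from the video/alpha track."""
    value, opacity = pixel
    return (value, opacity * alpha)


main = (0.2, 1.0)        # video/main pixel, fully opaque
additional = (0.8, 1.0)  # overlay pixel at a higher Altitude
alpha = 0.5              # co-located sample from the video/alpha track

case1 = over(mask(additional, alpha), main)  # alpha applies to the overlay
case2 = over(additional, mask(main, alpha))  # alpha applies to main, before compositing
case3 = mask(over(additional, main), alpha)  # alpha applies after compositing

print(case1, case2, case3)
```

The three results differ in value, opacity, or both, which is the ambiguity that Applies-to and per-alpha-track Altitudes are meant to resolve.
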

--Ben

Chris Pearce

2010-Apr-26 22:35 UTC

[theora-dev] Extension to Skeleton for multi-track media

On 23/03/2010 6:43 p.m., Silvia Pfeiffer wrote:
> Discussions about a need for an extension to Skeleton to cater for
> multi-track media files has been going on for a while. In a recent
> thread here, in discussions on IRC, and at FOMS between Jan, Ralph,
> Viktor and I, we discussed some fields. Viktor and I continued that
> discussion to make more specific recommendations on what fields to
> add.
>
> We now have a wiki page at http://wiki.xiph.org/SkeletonHeaders that
> has aggregated all the different things that were discussed.

Other than the message header fields at the wiki page above, is
there a list somewhere of the fields you'd like to see added to the 
skeleton track? Do you want any of the above fields to be compulsory in 
the next version of Skeleton? It looks like all of those headers would 
be specified in the fisbone packet, which I don't need to change for 
indexing. They also look like they're variable length anyway, so 
including them as message fields makes perfect sense.

I'm trying to nail down a spec for the indexing-related stuff for a 
skeleton 4.0 track, and it would be great if we could produce a unified 
spec with all the changes we both want for a skeleton 4.0 track. :)


All the best,
Chris P.
