thr3ads.net - ogg dev - [ogg-dev] Skeletal relations [Feb 2008]

Shane Stephens

2008-Feb-20 12:57 UTC

[ogg-dev] Skeletal relations

Hi again :)

On Wed, Feb 20, 2008 at 9:01 PM, ogg.k.ogg.k@googlemail.com <ogg.k.ogg.k@googlemail.com> wrote:
> > 1) Font data, as in the actual font itself, doesn't really belong in an
> > ogg stream.
>
> People wanting to have more control on the appearance of an overlay
> might want to control the font. Since font naming is largely non standard
> (eg, the foundry etc system (you know, *-*-*-* system) is X only I think,
> and I think Windows just has filenames), one can't specify a font to use
> in a way that you know will work. Provided you'd have the font in the
> first place. That's (one of ?) the reason various document types can have
> embedded fonts. Yes, it's not ideal, but there's no real other way to do
> it that I know of.

If you have an application-specific need for exact font data, then I think
the mechanism for retrieving this data should lie in your application, and
not in the media format that you're using for media data.  I would have said
the same thing to Adobe if they'd asked me ;-)

> > 2) We have been working on a specification and mechanism for indicating to
> > clients that there are multiple tracks of the same "kind" (e.g.
> > translation), and allowing clients to request individual tracks out of sets
> > of like tracks.  In fact with HTTP headers like Content-Language we can also
> > allow the server to default to a particular translation selection in the
> > absence of guidance from the client.  At the moment I think a preliminary
> > name for this specification is ROE - Silvia is in the process of nailing the
> > spec down so you should ask her any questions you have about it :)
> > Obviously this doesn't "solve" the duplication issue (if there is one) but
> > it does prevent duplicated data eating bandwidth.
>
> In this case, it's realtime muxing. That's a special case. While it probably
> does help in a lot of situations, it doesn't apply in all cases where one
> could use an Ogg stream. It's a great help though.
> Besides, when I coded the xine Kate plugin, I've made it so you can switch
> languages on the fly. All streams are decoded, but only the selected one
> (if any) is actually displayed.  This is not possible with such a scheme
> (not saying it's deficient, just that it also adds constraints).

Realtime vs. prerolled do not present problems here. ROE is agnostic as to
whether the media data is generated on the fly or stored on disk - it merely
needs a handle on the data.  Furthermore, with Annodex and ROE we will
absolutely be able to switch tracks on the fly - that's part of the point of
time-based URI values.  Furthermore we can do so without the overhead of
transmitting and decoding multiple (text / audio / video / whatever) tracks
at a time.  This is not a special case that we haven't considered :)

>
> > 3) Text is cheap!  Really cheap :)  Seriously - compare the amount of space
> > in your file taken up by text to that taken up even by audio, let alone
> > video.
>
> Yes, text is cheap, but not fonts. Especially if they have to be burst
> transmitted in headers before playback actually begins. It's also a corner
> case (custom font + lots of multiplexed streams), but I'd ideally like to
> have something that scales if possible.
> Speaking of scaling, one of the issues that I've seen is the large amount of
> framing data against codec data. Since I have a packet per page (for timing
> reasons), Ogg adds a lot of bytes to mine. But that is another story...
>
Fonts are not cheap.  That's true.  Again, though, I think that a
font-retrieval mechanism is outside the scope of Ogg, and should be dealt
with in an application-specific way (we would all obviously prefer a
consistent font naming scheme over all operating systems, but that's not
going to happen).

By the way, how would you deal with fonts if you didn't burst transmit
them?!  You can hardly decode them as a stream...

Cheers,
    -Shane

ogg.k.ogg.k@googlemail.com

2008-Feb-21 02:01 UTC

[ogg-dev] Skeletal relations

> If you have an application-specific need for exact font data, then I think
> the mechanism for retrieving this data should lie in your application, and
> not in the media format that you're using for media data. I would have said
> the same thing to Adobe if they'd asked me ;-)

We may or may not be talking about the same specific thing. My point was
not that there should be a shared font codec or similar, but that it might be
useful to consider a way to supply arbitrary data to multiplexed streams, each
codec then interpreting this data as it sees fit (possibly ignoring it). Now, I
kind of agree with your points too.

> Realtime vs. prerolled do not present problems here. ROE is agnostic as to
> whether the media data is generated on the fly or stored on disk - it merely
> needs a handle on the data. Furthermore, with Annodex and ROE we will
> absolutely be able to switch tracks on the fly - that's part of the point of
> time-based URI values. Furthermore we can do so without the overhead of
> transmitting and decoding multiple (text / audio / video / whatever) tracks
> at a time. This is not a special case that we haven't considered :)

If you do that, it implies that you have to modify the stream at runtime (eg, as
the vorbis streamers do - cache the headers, and send them at startup). To be
generic, you also have to resend any data pages needed for correct
interpretation of the stream (eg, whatever since the backlink).
If you do that, fine, but it's extra work that's outwith the scope of the codec
(eg, it needs extra code to do this). And it's generic, so it shouldn't really
have to depend on Annodex and/or ROE.
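
For illustration only, the "cache the headers, resend what is needed" idea
above could be sketched roughly as follows in Python. Everything here is an
assumption made for the sketch: the LiveRelay class, the page kinds ("header",
"sync", "data") and the per-client lists are hypothetical and do not come from
any Ogg library or from the vorbis streamers mentioned. A "sync" page stands in
for whatever page a backlink would point to.

```python
class LiveRelay:
    """Toy relay: late joiners get cached headers plus pages since the last sync point."""

    def __init__(self):
        self.headers = []      # header pages, cached for the life of the stream
        self.since_sync = []   # data pages from the most recent sync point onward
        self.clients = {}      # client id -> list of pages queued for that client

    def feed(self, page, kind):
        """Accept one page from the live source; kind is 'header', 'sync' or 'data'."""
        if kind == "header":
            self.headers.append(page)
        else:
            if kind == "sync":
                # Pages before this point are no longer needed by late joiners.
                self.since_sync = []
            self.since_sync.append(page)
        # Already connected clients simply receive the page as it arrives.
        for queue in self.clients.values():
            queue.append(page)

    def join(self, client_id):
        """Connect a client mid-stream, replaying what it needs to start decoding."""
        self.clients[client_id] = self.headers + self.since_sync
```

A client that joins after several sync points only receives the headers and
the pages from the latest sync point onward, which is the extra, codec-agnostic
work the paragraph above refers to.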

> Fonts are not cheap. That's true. Again, though, I think that a
> font-retrieval mechanism is outside the scope of Ogg, and should be dealt
> with in an application-specific way (we would all obviously prefer a

Yes, I agree with this, the font thing was just an example.

> By the way, how would you deal with fonts if you didn't burst transmit
> them?! You can hardly decode them as a stream...

I didn't have any particular method in mind, just didn't want to assume burst
transmitting was the only option. Thinking about this now, you could send each
character as it is used, keeping track of what characters are already sent.
No claim that it's a clever thing to do though :)
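
As a rough illustration of the "send each character as it is used" idea, the
bookkeeping might look something like the Python sketch below. The function
name and the use of a plain set of characters are assumptions made for the
example; actual glyph data, and how it would be packed into a stream, are left
out entirely.

```python
def glyphs_to_send(text, already_sent):
    """Return the characters of `text` not sent so far, in first-use order.

    `already_sent` is a set that is updated in place, so repeated calls
    across successive text events never send the same character twice.
    """
    new_chars = []
    for ch in text:
        if ch not in already_sent:
            already_sent.add(ch)
            new_chars.append(ch)
    return new_chars
```

For example, after a first event containing "hello", a second event containing
"hold" would only need the glyph for "d".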

Shane Stephens

2008-Feb-21 03:13 UTC

[ogg-dev] Skeletal relations

On Thu, Feb 21, 2008 at 9:00 PM, ogg.k.ogg.k@googlemail.com <ogg.k.ogg.k@googlemail.com> wrote:
> > If you have an application-specific need for exact font data, then I think
> > the mechanism for retrieving this data should lie in your application, and
> > not in the media format that you're using for media data. I would have said
> > the same thing to Adobe if they'd asked me ;-)
>
> We may or may not be talking about the same specific thing. My point was
> not that there should be a shared font codec or similar, but that it might
> be useful to consider a way to supply arbitrary data to multiplexed streams,
> each codec then interpreting this data as it sees fit (possibly ignoring it).
> Now, I kind of agree with your points too.
>
>
I guess I'm a little concerned about putting arbitrary data in Ogg, period.
My point was that regardless of codec or name resolution technology used,
font downloading (like downloading of any other non-media-related data)
should be done out-of-band.

Of course you could always stuff your up-front data into the vorbis comment
packet...

>
> > Realtime vs. prerolled do not present problems here. ROE is agnostic as to
> > whether the media data is generated on the fly or stored on disk - it merely
> > needs a handle on the data. Furthermore, with Annodex and ROE we will
> > absolutely be able to switch tracks on the fly - that's part of the point of
> > time-based URI values. Furthermore we can do so without the overhead of
> > transmitting and decoding multiple (text / audio / video / whatever) tracks
> > at a time. This is not a special case that we haven't considered :)
>
> If you do that, it implies that you have to modify the stream at runtime
> (eg, as the vorbis streamers do - cache the headers, and send them at
> startup). To be generic, you also have to resend any data pages needed for
> correct interpretation of the stream (eg, whatever since the backlink).
> If you do that, fine, but it's extra work that's outwith the scope of the
> codec (eg, it needs extra code to do this). And it's generic, so it shouldn't
> really have to depend on Annodex and/or ROE.
>
>
Modify which stream?  Also, this is very much building on the capabilities
and properties of Ogg and Annodex.

Assume we have a set of resources {A,B,C,D,E} on disk, in an ogg file.  A
ROE-capable player requests an initial subset {A,C,E} from a ROE-enabled
server, using one of a number of mechanisms.  The server streams just A, C
and E to the player (and here we have Ogg-specific requirement #1: we need
to be able to extract subsets of streams from a multiplexed stream, quickly
and with minimal server effort.  Ogg allows this, some other container
formats don't.).  At some time T, the player decides to switch {C -> D}.  The
player makes a request to the server using a ROE/Annodex URI for {A,D,E} with
time offset T, and stops playing {A,C,E}.  Here of course we have the second
Ogg-specific requirement - at this stage no-one else supports URIs with time
offsets.  By using these, we've avoided modification of the original file on
disk, avoided vorbis-streamer-style special chained streams, and the only
extension required to mod-annodex is selection of a subset of streams (this
is something we were planning on adding anyway).
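
To make the exchange above concrete, here is a minimal Python sketch of the
two halves: the player building a request for {A,D,E} at offset T, and a toy
server picking matching pages out of a multiplexed list. The query parameter
names "track" and "t", the example URL, and the (track, time, payload) page
tuples are all assumptions made for this sketch; they are not taken from the
ROE or Annodex specifications or from mod-annodex.

```python
from urllib.parse import urlencode


def switch_tracks(current, old, new):
    """Return the track selection after replacing `old` with `new`."""
    return [new if name == old else name for name in current]


def build_request(base_url, tracks, offset_seconds):
    """Build a URI asking for a subset of tracks from a time offset.

    "track" and "t" are hypothetical parameter names used for illustration.
    """
    query = urlencode(
        {"track": ",".join(tracks), "t": f"{offset_seconds:.3f}"},
        safe=",",  # keep the track list readable instead of encoding commas
    )
    return f"{base_url}?{query}"


def serve(pages, tracks, offset_seconds):
    """Toy server side: keep pages of the wanted tracks at or after the offset.

    `pages` is an iterable of (track, time_seconds, payload) tuples. A real
    server would also have to send each selected track's header pages first.
    """
    wanted = set(tracks)
    return [p for p in pages if p[0] in wanted and p[1] >= offset_seconds]
```

Switching {C -> D} at T = 12.5 would then be
build_request(url, switch_tracks(["A", "C", "E"], "C", "D"), 12.5), with the
file on disk left untouched and only the selection done per request.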

>
> > Fonts are not cheap. That's true. Again, though, I think that a
> > font-retrieval mechanism is outside the scope of Ogg, and should be dealt
> > with in an application-specific way (we would all obviously prefer a
>
> Yes, I agree with this, the font thing was just an example.
>
> > By the way, how would you deal with fonts if you didn't burst transmit
> > them?! You can hardly decode them as a stream...
>
> I didn't have any particular method in mind, just didn't want to assume
> burst transmitting was the only option. Thinking about this now, you could
> send each character as it is used, keeping track of what characters are
> already sent.
> No claim that it's a clever thing to do though :)
>

Clever: yes.  Sensible: probably not :)

Cheers,
    -Shane

Ralph Giles

2008-Feb-25 17:30 UTC

[ogg-dev] Skeletal relations

On 20-Feb-08, at 12:57 PM, Shane Stephens wrote:
> If you have an application-specific need for exact font data, then
> I think the mechanism for retrieving this data should lie in your
> application, and not in the media format that you're using for
> media data.  I would have said the same thing to Adobe if they'd
> asked me ;-)

"Fonts are hard."

I think this belongs as something media-mapping specific. Most vector  
formats have some way of dealing with fonts. In general they're  
either included directly or there's some established scheme of  
referencing external resources, possibly with intelligent  
substitution. But if you're putting SVG in a stream, you want to use  
the SVG mechanism, etc.

I'd also mention that alphabetic fonts are cheap, but ideographic  
(and cursive) fonts aren't, on the scale of web video.

  -r
