Hi all,
Apologies in advance, this email is quite long.
I've prepared an updated Vorbis RTP Internet Draft, a continuation of
draft-moffitt-vorbis-rtp-00.txt. The new draft is included below.
If this new draft gets the ok I'd like to submit this to the AVT WG
later this week.
There are a number of changes over the original I-D, notably the
changing of the M bit function in the RTP header to match current AVT
practice, together with an initial suggestion for codebook delivery.
Other changes are the expansion of previous sections and the explicit
declarations of certain requirements.
For background you may want to read the minutes of the IETF meeting
where Jack presented the initial draft:
http://www.ietf.org/proceedings/01mar/ietf50-135.htm
In updating this I-D I've trawled the Vorbis-dev ML and I think I've
taken a reasonable consensus line, but comments and feedback are
needed.
The biggest area is the transmission of the codebooks. I've taken the
approach of transmitting them using RTCP, and added a checksum block for
integrity checking. There was discussion on the AVT list a few weeks ago
concerning TCP over RTP, and there are plans for an I-D to cover this. I
think this could be a better solution than using RTSP for codebook
delivery. I have nothing against RTSP, but I wanted to keep the focus of
this I-D as close as possible to that of RFC 1889. This approach fits
well with both unicast and multicast models; however, comments and
alternative suggestions from others are most welcome.
Another area where the Vorbis spec and current practices of the AVT WG
differ is the output channel order.
The output channel order in draft-ietf-avt-profile-new-12 I-D, sect 4.1,
defines the order as:
l    left
r    right
c    center
S    surround
F    front
R    rear

channels  description   channel
                        1    2    3    4    5    6
__________________________________________________
2         stereo        l    r
3                       l    r    c
4         quadrophonic  Fl   Fr   Rl   Rr
4                       l    c    r    S
5                       Fl   Fr   Fc   Sl   Sr
6                       l    lc   c    r    rc   S
The 3-, 5- and 6-channel layouts do not match the Vorbis channel order.
The I-D does state that the channel ordering SHOULD follow the AIFF-C
format, so we can use the Vorbis layout if we push it, but this is
something we may want to review.
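To make the mismatch concrete, here is a rough sketch of reordering one frame of samples from the Vorbis order to the AVT order for the 3 and 5 channel cases. The Vorbis orders are taken from the Vorbis I specification. Treating the AVT "Sl/Sr" channels as the Vorbis rear left/right pair is my assumption, and the function name is made up for illustration. The 6 channel case is left out because the AVT layout (l lc c r rc S) is not a permutation of Vorbis 5.1.

```python
# Channel names per position. Vorbis orders follow the Vorbis I specification;
# mapping AVT "Sl"/"Sr" onto the Vorbis rear pair is an assumption.
VORBIS_ORDER = {
    3: ["l", "c", "r"],
    5: ["Fl", "Fc", "Fr", "Rl", "Rr"],
}
AVT_ORDER = {
    3: ["l", "r", "c"],
    5: ["Fl", "Fr", "Fc", "Rl", "Rr"],
}


def vorbis_to_avt(frame):
    """Reorder one frame (one sample per channel) from Vorbis to AVT order."""
    n = len(frame)
    if n not in VORBIS_ORDER:
        return list(frame)  # layouts not listed above are left untouched
    src = VORBIS_ORDER[n]
    return [frame[src.index(name)] for name in AVT_ORDER[n]]
```

Mono, stereo and quadrophonic frames pass through unchanged, since those layouts already agree.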
An entry for Vorbis RTP streams should be added to the Audio Encodings
table in section 4.5 of draft-ietf-avt-profile-new-12, together with a
MIME type. A draft MIME type document should be ready either later this
week, or just after the holidays.
Comments, feedback and minor flames welcomed.
Regards
Phil
---------------------8<------------------------8<-----------------------
Network Working Group                                          Phil Kerr
Internet-Draft                                  The Ogg Vorbis Community
December 20, 2002                                            / OpenDrama
Expires: June 20, 2003

             RTP Payload Format for Vorbis Encoded Audio
                 <draft-kerr-avt-vorbis-rtp-00.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as
"work in progress".
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1].
Copyright Notice
Copyright (C) The Internet Society (2002). All Rights Reserved.
Abstract
This document describes an RTP payload format for transporting Vorbis
encoded audio.
Table of Contents
1. Introduction ........................................ x
2. Background .......................................... x
3. Payload Format ...................................... x
3.1 RTP Header .......................................... x
3.2 Payload Header ...................................... x
3.3 Payload Data ........................................ x
3.4 Example RTP Packet .................................. x
4. Frame Packetizing ................................... x
4.1 Example Fragmented Vorbis Packet .................... x
5. Codebooks ........................................... x
6. Security Considerations ............................. x
7. Acknowledgments ..................................... x
8. References .......................................... x
9. Full Copyright Statement ............................ x
10. Authors Address ..................................... x
1 Introduction
This document describes how Vorbis encoded audio may be formatted for
use as an RTP payload type.
2 Background
The Xiph.org Foundation creates and defines codecs for use in
multimedia that are not encumbered by patents and thus may be freely
implemented by any individual or organization.
Vorbis is the general purpose multi-channel audio codec created by
the Xiph.org Foundation.
Vorbis encoded audio is generally found within an Ogg format
bitstream, which provides framing and synchronization. For the
purposes of RTP transport, this layer is unnecessary, and so raw
Vorbis packets are used in the payload.
Vorbis packets are unbounded in length currently. At some future
point there will likely be a practical limit placed on packet
length.
Typical Vorbis packet sizes range from very small (2-3 bytes) to
quite large (8-12 kilobytes). The reference implementation [2]
seems to make every packet less than ~800 bytes, except for the
codebook packets, which are ~8-12 kilobytes.
Within an RTP context the maximum Vorbis packet size SHOULD be kept
below the MTU size of 1500 octets, including the RTP and payload
headers, to avoid fragmentation.
3 Payload Format
The standard RTP header is followed by an 8 bit payload header, and
then the payload data.
3.1 RTP Header
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The RTP header begins with an octet of fields (V, P, X, and CC) to
support specialized RTP uses (see [4] and [5] for details). For
Vorbis RTP applications, V is set to 2, and the P, X, and CC fields
are set to 0.
Marker (M): 1 bit
Set to zero. Audio silence suppression is not used. This conforms
to section 4.1 of [5].
Payload Type (PT): 7 bits
An RTP profile for a class of applications is expected to assign a
payload type for this format, or a dynamically allocated payload
type should be chosen which designates the payload as Vorbis.
Sequence number: 16 bits
The sequence number increments by one for each RTP data packet
sent, and may be used by the receiver to detect packet loss and
to restore packet sequence. This field is detailed further in
[3].
Timestamp: 32 bits
A timestamp representing the sampling time of the first sample of
the first Vorbis packet in the RTP packet. The clock frequency
MUST be set to the sample rate of the encoded audio data and is
conveyed out-of-band.
SSRC/CSRC identifiers:
These fields are 32 bits each, with one SSRC field and up to
15 CSRC fields, as defined in [3].
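As an illustration of the header layout described above, the fixed 12 octet RTP header with V=2 and P, X and CC set to 0 could be built and parsed roughly as follows. This is a minimal sketch, not part of the draft: the function names are hypothetical, and CSRC entries and header extensions are ignored.

```python
import struct


def pack_rtp_header(pt, seq, timestamp, ssrc, marker=0):
    """Build the fixed 12 octet RTP header with V=2, P=0, X=0, CC=0."""
    byte0 = 2 << 6                                  # version 2, other bits zero
    byte1 = ((marker & 1) << 7) | (pt & 0x7F)       # M bit plus 7 bit payload type
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(data):
    """Decode the fixed RTP header fields from the first 12 octets."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "padding": (byte0 >> 5) & 1,
        "extension": (byte0 >> 4) & 1,
        "csrc_count": byte0 & 0x0F,
        "marker": byte1 >> 7,
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

The timestamp is expressed in units of the audio sample rate, so a sender would advance it by the number of samples covered by each RTP packet.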
3.2 Payload Header
The first octet of the payload data is the payload header:
  1   2   3   4   5   6   7   8
+---+---+---+---+---+---+---+---+
| C | F | R |   # of packets    |
+---+---+---+---+---+---+---+---+
C: 1 bit
Set to one if this is a continuation of a fragmented packet.
F: 1 bit
Set to one if the payload contains complete packets or if it
contains the last fragment of a fragmented packet.
R: 1 bit
Reserved, must be set to zero by senders, and ignored by
receivers.
The last 5 bits are the number of complete packets in this payload.
A 5 bit field can hold values from 0 to 31, so this provides for a
maximum of 31 Vorbis packets in the payload. If C is set to one, this
number should be 0.
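The payload header octet can be sketched as below. This assumes that C is the most significant bit of the octet (network bit order, matching the left-to-right order of the diagram). The helper names are hypothetical.

```python
def pack_payload_header(c, f, count):
    """Pack the C, F and packet-count fields into one octet (R stays 0)."""
    if not 0 <= count <= 31:
        raise ValueError("packet count must fit in 5 bits")
    if c and count != 0:
        raise ValueError("continuation fragments carry a count of 0")
    return bytes([(c & 1) << 7 | (f & 1) << 6 | count])


def unpack_payload_header(octet):
    """Return (C, F, count); the reserved R bit is ignored."""
    return (octet >> 7) & 1, (octet >> 6) & 1, octet & 0x1F
```

A receiver ignores the R bit, so masking with 0x1F is enough to recover the count even if a sender sets it by mistake.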
3.3 Payload Data
If the payload contains a single Vorbis packet or a Vorbis packet
fragment, the Vorbis packet data follows the payload header.
For payloads which consist of multiple Vorbis packets, payload data
consists of one octet representing the packet length followed by the
packet data for each of the Vorbis packets in the payload.
The Vorbis packet length octet is the length minus one. A value of
0 means a length of 1.
The payload packing of the Vorbis data packets SHOULD follow the
guidelines set out in section 4.4 of [5], where the oldest packet
occurs immediately after the RTP packet header.
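The packing rules above might be implemented along these lines. This is a sketch under a few assumptions: a payload with a single complete packet carries a count of 1 and no length octet, bundled packets are each 1 to 256 octets, and the function names are made up for illustration.

```python
def pack_payload(packets):
    """Build the payload (header octet plus data) for complete Vorbis packets."""
    if not 1 <= len(packets) <= 31:
        raise ValueError("need between 1 and 31 complete packets")
    header = bytes([0x40 | len(packets)])      # C=0, F=1, R=0, count
    if len(packets) == 1:
        return header + packets[0]             # single packet: no length octet
    body = bytearray()
    for pkt in packets:
        if not 1 <= len(pkt) <= 256:
            raise ValueError("bundled packets must be 1 to 256 octets")
        body.append(len(pkt) - 1)              # length octet is length minus one
        body += pkt
    return header + bytes(body)


def unpack_payload(payload):
    """Split a payload of complete packets back into Vorbis packets."""
    count = payload[0] & 0x1F
    if count == 1:
        return [payload[1:]]
    packets, pos = [], 1
    for _ in range(count):
        length = payload[pos] + 1
        packets.append(payload[pos + 1:pos + 1 + length])
        pos += 1 + length
    return packets
```

Packets are appended oldest first, so the oldest packet sits directly after the payload header.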
3.4 Example RTP Packet
Here is an example RTP packet containing two Vorbis packets.
RTP Packet Header:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 2 |0|0|   0   |0|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               timestamp (in sample rate units)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Payload Data:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|1|0| # pks: 2|      len      |        vorbis data ...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       ...vorbis data...                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      ...      |      len      |  next vorbis packet data...   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
4 Frame Packetizing
Each RTP packet contains either one complete Vorbis packet, one
Vorbis packet fragment, or an integer number of complete Vorbis
packets (up to a maximum of 31 packets, since the number of packets
is defined by a 5 bit value).
Any Vorbis packet that is larger than 256 octets and less than the
path-MTU should be placed in an RTP packet by itself.
Any Vorbis packet that is 256 octets or less should be bundled in the
RTP packet with as many Vorbis packets as will fit, up to a maximum
of 31.
If a Vorbis packet will not fit into the RTP packet, it must be
fragmented. A fragmented packet has a zero in the last five bits
of the payload header. Each fragment after the first will also set
the Continued (C) bit to one in the payload header. The RTP packet
containing the last fragment of the Vorbis packet will have the
F bit set to one.
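The fragmentation rule can be sketched as follows. This follows the text of the Payload Data section, where fragment data directly follows the payload header octet; that reading, the function name and the max_data parameter (data octets available per RTP packet) are assumptions made for illustration.

```python
def fragment_packet(packet, max_data):
    """Split one oversized Vorbis packet into fragment payloads."""
    if max_data < 1:
        raise ValueError("max_data must be at least 1")
    if len(packet) <= max_data:
        raise ValueError("packet fits in one payload, no fragmentation needed")
    chunks = [packet[i:i + max_data] for i in range(0, len(packet), max_data)]
    payloads = []
    for index, chunk in enumerate(chunks):
        c = 1 if index > 0 else 0                    # set on every fragment after the first
        f = 1 if index == len(chunks) - 1 else 0     # set only on the last fragment
        payloads.append(bytes([c << 7 | f << 6]) + chunk)   # packet count stays 0
    return payloads
```

A receiver would concatenate the data of consecutive fragments until it sees the F bit, using the RTP sequence number to detect a lost fragment.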
4.1 Example Fragmented Vorbis Packet
Here is an example fragmented Vorbis packet split over three RTP
packets.
RTP packet header details have been excluded from this example.
Packet 1:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|0|0|        0|      len      |        vorbis data ...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       ...vorbis data...                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The number of packets field is set to 0.
Packet 2:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|0|0|        0|      len      |        vorbis data ...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       ...vorbis data...                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The C bit is set to 1 and the number of packets field is set to 0.
For large Vorbis packets there can be several payload packets of
this type. The maximum packet size should be no greater than the
MTU of 1500 octets, including all RTP and payload headers.
Packet 3:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|1|0|        0|      len      |        vorbis data ...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       ...vorbis data...                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This is the last Vorbis fragment packet. The C and F bits are
set and the packet count remains set to 0.
5 Codebooks
To decode a Vorbis stream, a set of codebooks is required. These
codebooks are allowed to change for each logical bitstream (for
example, for each song encoded in a radio stream).
The codebooks must be completely intact; a client cannot decode
a stream with an incomplete or corrupted set.
A client connecting to a multicast RTP Vorbis session needs to get
the first set of codebooks in some manner. These codebooks are
typically between 4 kilobytes and 8 kilobytes in size.
On joining a session the first packet sent MUST be a Vorbis
codebook message.
When the codebooks change, a new set is sent as an SR just prior to
the Vorbis bitstream change, as an APP defined RTCP message with
the 4 octet name field set to VORC. This is the same format as
the initial codebook packet.
Codebook RTCP packets MUST set the padding (P) flag and add the
appropriate padding octets needed to conform with section 6.6
of [3].
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| subtype |  PT=APP=204   |            length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           SSRC/CSRC                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             VORC                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       codebook checksum       |          codebook ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
A 16 bit 1's complement checksum of the codebook precedes the
codebook datablock. This checksum is used to detect a corrupted
codebook. If a checksum failure is detected, an empty RR RTCP
message, of APP type with the 4 octet name field set to VORR, is
sent from the client. Transmission of the codebook back to the
client SHOULD be handled as a unicast delivery to prevent a
rogue client from generating an excessive number of codebook
requests within a multicast stream; however, multicast transmission
of codebook request replies SHOULD be catered for at the application
level.
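The text above does not spell out how the checksum is computed. One plausible reading, shown here purely as an assumption, is the Internet checksum of RFC 1071: sum the data as 16 bit big-endian words with end-around carry, pad an odd trailing octet with zero, and take the one's complement of the result. The function names are hypothetical.

```python
def ones_complement_checksum(data):
    """16 bit one's complement checksum in the style of RFC 1071 (assumed)."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"                              # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]        # next big-endian 16 bit word
        total = (total & 0xFFFF) + (total >> 16)     # fold the carry back in
    return ~total & 0xFFFF


def codebook_is_intact(codebook, received_checksum):
    """Compare the transmitted checksum with one recomputed by the receiver."""
    return ones_complement_checksum(codebook) == received_checksum
```

A client that finds codebook_is_intact() false would send the VORR request described above rather than attempt to decode.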
6 Security Considerations
RTP packets using this payload format are subject to the security
considerations discussed in the RTP specification [3]. This implies
that the confidentiality of the media stream is achieved by using
encryption. Because the data compression used with this payload
format is applied end-to-end, encryption may be performed on the
compressed data.
7 Acknowledgments
This I-D is a continuation of draft-moffitt-vorbis-rtp-00.txt.
Thanks to the Ogg Vorbis Community and to the Xiph.org team,
especially Jack Moffitt <jack@xiph.org>.
8 References
1. Key words for use in RFCs to Indicate Requirement Levels
(RFC 2119).
2. libvorbis: Available from the Xiph website, http://www.xiph.org
3. RTP: A Transport Protocol for Real-Time Applications (RFC 1889).
4. RTP: A transport protocol for real-time applications. Work
in progress, draft-ietf-avt-rtp-new-11.txt.
5. RTP Profile for Audio and Video Conferences with Minimal Control.
Work in progress, draft-ietf-avt-profile-new-12.txt.
9 Full Copyright Statement
Copyright (C) The Internet Society (2002). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
10 Authors Address
Phil Kerr
Centre for Music Technology
University of Glasgow
email: philkerr@elec.gla.ac.uk
WWW: http://www.xiph.org/
--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'vorbis-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body. No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.
I think we should find a more flexible way to define the mapping of
channels to different surround formats. How could I encode left, right
and LFE channels for 5.1? Or l r c LFE, or l r Sl Sr, etc.? Then we
have 7.1, 6.1... and what about Ambisonics? How are surround channels
automatically down-mixed to stereo or mono?

Phil Kerr wrote:
> Another area where the Vorbis spec and current practices of the AVT WG
> differ is the output channel order.
>
> The output channel order in draft-ietf-avt-profile-new-12 I-D, sect 4.1,
> defines the order as:
>
> l    left
> r    right
> c    center
> S    surround
> F    front
> R    rear
>
> channels  description   channel
>                         1    2    3    4    5    6
> __________________________________________________
> 2         stereo        l    r
> 3                       l    r    c
> 4         quadrophonic  Fl   Fr   Rl   Rr
> 4                       l    c    r    S
> 5                       Fl   Fr   Fc   Sl   Sr
> 6                       l    lc   c    r    rc   S
>
> The 3, 5 and 6 channels layout do not match.
>
> The I-D does state that the channel ordering SHOULD follow the AIFF-C
> format, so we can use the Vorbis layout if we push it, but this may be
> something we may want to review.
Hi all,
Below is a final draft of the updated Vorbis RTP Internet Draft which
I'll send to the IETF in a few days. The changes include:
Added IANA MIME type section
Redesigned setup, codebook and comment metadata packet, fixing bugs
Added SDP section
Added congestion section
Extended acknowledgments section
Various textual tweaks
Thanks to everyone who has contributed and of course feedback welcomed.
Cheers
Phil
-------------------8<------------------------8<-----------------
Network Working Group                                          Phil Kerr
Internet-Draft                                    Ogg Vorbis Community /
February 20, 2003                                              OpenDrama
Expires: August 20, 2003

             RTP Payload Format for Vorbis Encoded Audio
                 <draft-kerr-avt-vorbis-rtp-01.txt>
Status of this Memo
This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as
"work in progress".
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1].
Copyright Notice
Copyright (C) The Internet Society (2003). All Rights Reserved.
Abstract
This document describes an RTP payload format for transporting
Vorbis encoded audio. It details the encapsulation mechanism for
raw Vorbis data and the delivery mechanism for the decoder
probability model, referred to as a codebook, and other decoder
setup information.
Kerr                    Expires August 20, 2003                 [Page 1]
Internet Draft   draft-kerr-avt-rtp-vorbis-01.txt      February 20, 2003

Table of Contents
1. Introduction ........................................ 2
2. Payload Format ...................................... 3
2.1 RTP Header .......................................... 3
2.2 Payload Header ...................................... 4
2.3 Payload Data ........................................ 4
2.4 Example RTP Packet .................................. 5
3. Frame Packetizing ................................... 5
3.1 Example Fragmented Vorbis Packet .................... 6
4. IANA Considerations ................................. 7
5. Configuration headers ............................... 7
6. Session Description ................................. 10
7. Congestion Control .................................. 10
8. Security Considerations ............................. 10
9. Acknowledgments ..................................... 10
10. References .......................................... 11
11. Full Copyright Statement ............................ 11
12. Authors Address ..................................... 12
1 Introduction
The Xiph.org Foundation creates and defines codecs for use in
multimedia that are not encumbered by patents and thus may be freely
implemented by any individual or organization.
Vorbis is the general purpose multi-channel audio codec created by
the Xiph.org Foundation.
Vorbis encoded audio is generally encapsulated within an Ogg format
bitstream, which provides framing and synchronization. For the
purposes of RTP transport, this layer is unnecessary, and so raw
Vorbis packets are used in the payload.
Vorbis packets are unbounded in length currently. At some future
point there will likely be a practical limit placed on packet
length.
Typical Vorbis packet sizes are from very small (2-3 bytes) to
quite large (8-12 kilobytes). The reference implementation [2]
typically produces packets of less than ~800 bytes, except for the
header packets, which are ~4-12 kilobytes.
Within an RTP context the maximum Vorbis packet size SHOULD be kept
below the MTU size of 1500 octets, including the RTP and payload
headers, to avoid fragmentation. For the delivery of Vorbis audio
using RTP the maximum size of the header block is limited to 64K.
2 Payload Format
The standard RTP header is followed by an 8 bit payload header,
then the payload data.
2.1 RTP Header
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The RTP header begins with an octet of fields (V, P, X, and CC) to
support specialized RTP uses (see [4] and [5] for details). For
Vorbis RTP applications, V is set to 2, and the P, X, and CC fields
are set to 0.
Marker (M): 1 bit
Set to zero. Audio silence suppression not used. This conforms
to section 4.1 of [6].
Payload Type (PT): 7 bits
An RTP profile for a class of applications is expected to assign
a payload type for this format, or a dynamically allocated
payload type should be chosen which designates the payload as
Vorbis.
Sequence number: 16 bits
The sequence number increments by one for each RTP data packet
sent, and may be used by the receiver to detect packet loss and
to restore packet sequence. This field is detailed further in
[3].
Timestamp: 32 bits
A timestamp representing the sampling time of the first sample of
the first Vorbis packet in the RTP packet. The clock frequency
MUST be set to the sample rate of the encoded audio data and is
conveyed out-of-band.
SSRC/CSRC identifiers:
These fields are 32 bits each, with one SSRC field and up to
15 CSRC fields, as defined in [3].
2.2 Payload Header
The first octet of the payload data is the payload header:
  1   2   3   4   5   6   7   8
+---+---+---+---+---+---+---+---+
| C | F | R |   # of packets    |
+---+---+---+---+---+---+---+---+
C: 1 bit
Set to one if this is a continuation of a fragmented packet.
F: 1 bit
Set to one if the payload contains complete packets or if it
contains the last fragment of a fragmented packet.
R: 1 bit
Reserved, must be set to zero by senders, and ignored by
receivers.
The last 5 bits are the number of complete packets in this payload.
A 5 bit field can hold values from 0 to 31, so this provides for a
maximum of 31 Vorbis packets in the payload. If C is set to one, this
number should be 0.
2.3 Payload Data
If the payload contains a single Vorbis packet or a Vorbis packet
fragment, the Vorbis packet data follows the payload header.
For payloads which consist of multiple Vorbis packets, payload data
consists of one octet representing the packet length followed by the
packet data for each of the Vorbis packets in the payload.
The Vorbis packet length octet is the length of the data block
minus one.
The payload packing of the Vorbis data packets SHOULD follow the
guidelines set out in section 4.4 of [5], where the oldest packet
occurs immediately after the RTP packet header.
Channel mapping of the audio is in accordance with ITU-R
BS.775-1.
2.4 Example RTP Packet
Here is an example RTP packet containing two Vorbis packets.
RTP Packet Header:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 2 |0|0|   0   |0|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               timestamp (in sample rate units)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Payload Data:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|1|0| # pks: 2|      len      |        vorbis data ...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       ...vorbis data...                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|      ...      |      len      |  next vorbis packet data...   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
3 Frame Packetizing
Each RTP packet contains either one complete Vorbis packet, one
Vorbis packet fragment, or an integer number of complete Vorbis
packets (up to a maximum of 31 packets, since the number of packets
is defined by a 5 bit value).
Any Vorbis packet that is larger than 256 octets and less than the
path-MTU should be placed in an RTP packet by itself.
Any Vorbis packet that is 256 octets or less should be bundled in the
RTP packet with as many Vorbis packets as will fit, up to a maximum
of 31.
If a Vorbis packet will not fit into the RTP packet, it must be
fragmented. A fragmented packet has a zero in the last five bits
of the payload header. Each fragment after the first will also set
the Continued (C) bit to one in the payload header. The RTP packet
containing the last fragment of the Vorbis packet will have the
F bit set to one.
3.1 Example Fragmented Vorbis Packet
Here is an example fragmented Vorbis packet split over three RTP
packets.
RTP packet header details have been excluded from this example.
Packet 1:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0|0|0|        0|      len      |        vorbis data ...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        ..vorbis data..                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The number of packets field is set to 0.
Packet 2:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|0|0|        0|      len      |        vorbis data ...        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        ..vorbis data..                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The C bit is set to 1 and the number of packets field is set to 0.
For large Vorbis packets there can be several payload packets of
this type. The maximum packet size should be no greater than the
MTU of 1500 octets, including all RTP and payload headers.
Packet 3:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1|1|0|        0|      len      |        vorbis data ..         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        ..vorbis data..                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
This is the last Vorbis fragment packet. The C and F bits are
set and the packet count remains set to 0.
4 IANA Considerations
Media MIME type name: audio
Media MIME subtype name: vorbis
Required Parameters: none
Optional Parameters: none
5 Configuration headers
To decode a Vorbis stream three configuration header information
blocks are needed. This data is sent out-of-band and is defined
below as an APP defined RTCP message with the 4 octet name field
set to VORB.
On joining a session the first packet sent back to the client
MUST be a Vorbis message containing the codec setup and codebook
data.
VORB RTCP packets MUST set the padding (P) flag and add the
appropriate padding octets needed to conform with section 6.6
of [3]. Synchronising the configuration headers to the RTP stream
is critical. A 32 bit timestamp field is used to indicate the
timepoint when a VORB header MUST be applied to the RTP stream.
VORB RTCP packets MUST be sent just ahead of the change in the RTP
stream.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| subtype |   PT=APP=204  |             Length            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           SSRC/CSRC                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             VORB                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Timestamp (in sample rate units)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Vorbis Version                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Audio Sample Rate                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Bitrate Maximum                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Bitrate Nominal                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Bitrate Minimum                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| bsz 0 | bsz 1 |      Num Audio Channels       |c|m|o|x|x|x|x|x|
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|        Codebook length        |       Codebook checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
..                           Codebook                           |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                     Vendor string length                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Vendor string                         ..
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   User comments list length                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
..              User comment length / User comment              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
..                          URI string                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The first Vorbis config header defines the Vorbis stream
attributes. The Vorbis version MUST be set to zero to comply with
this document. The fields Sample Rate up to Num Audio Channels
are set in accordance with [6] with the bsz fields above referring
to the blocksize parameters. The framing bit is not used for RTP
transportation and so applications constructing Vorbis files MUST
take care to set this if required.
The next 8 bits are used to indicate the presence of the two
other Vorbis stream config headers and the size overflow header.
The c flag indicates the presence of a Codebook header block and
the m flag indicates the presence of a comment metadata block. The
o flag indicates that the size of either the c or the m header
would make the VORB packet larger than allowed for an RTCP message.
The remaining five bits, indicated with an x, are reserved/unused
and MUST be set to 0.
If the c flag is set then the next header block will contain the
codebook configuration data. Unlike other mainstream audio codecs,
Vorbis has no statically configured probability model; instead it
packs all entropy decoding configuration, VQ and Huffman models
into a self-contained codebook. This codebook block also requires
additional identification information detailing the number of audio
channels, bit rates and other information used to initialise the
Vorbis stream.
This setup information MUST be completely intact: a client cannot
decode a stream with an incomplete or corrupted codebook set.
A 16 bit codebook length field and a 16 bit 1's complement checksum
of the codebook precede the codebook datablock. The length field
allows for codebooks to be up to 64k in size. The checksum is used to
detect a corrupted codebook. If a checksum failure is detected then
a new config header file SHOULD be obtained from SDP. If no SDP
value is set and no other method for obtaining the config headers
exists then this is considered to be a failure and should be
reported to the client application.
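The draft does not spell out the checksum algorithm. As a sketch only, the following assumes the usual Internet-style 16 bit 1's complement sum: big-endian 16 bit words, odd-length input padded with a zero octet, carries folded back in, and the result inverted. The function names are hypothetical.

```python
def ones_complement_checksum(data: bytes) -> int:
    """16 bit 1's complement checksum (Internet checksum style, assumed)."""
    if len(data) % 2:
        data += b"\x00"                       # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                        # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def codebook_is_intact(codebook: bytes, expected: int) -> bool:
    """Compare the computed checksum with the Codebook checksum field."""
    return ones_complement_checksum(codebook) == expected
```

A client would call codebook_is_intact() before handing the codebook to the decoder and fall back to the SDP-supplied header file when it returns False.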
If the m flag is set then the next header block will contain the
comment metadata, such as artist name, track title and so on. These
metadata messages are not intended to be fully descriptive but to
offer basic track/song information. This message MUST be sent at
the start of the stream, together with the setup and codebook
headers, even if it contains no information. During a session the
metadata associated with the stream may change from that specified
at the start, e.g. a live concert broadcast changing acts or scenes,
so clients MUST be able to receive m header blocks throughout the
session. Details on the format of the comments can be found in the
Vorbis documentation [7].
The format for the data takes the form of a 32 bit codec vendor
name length field followed by the name encoded in UTF-8. The next
field denotes the number of user comments, followed by that many
pairs of user comment length and comment text fields.
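The comment block layout described above can be walked with a short parser. This is a sketch under stated assumptions: every length and count field is 32 bits wide, and the byte order defaults to network order here, which the draft does not state (native Vorbis comment headers use little-endian fields, so pass "<I" for that case). The function name is hypothetical.

```python
import struct


def parse_comment_block(buf: bytes, offset: int = 0, fmt: str = "!I"):
    """Return (vendor, comments, next_offset) for a comment block."""

    def read_string(pos: int):
        (length,) = struct.unpack_from(fmt, buf, pos)
        pos += 4
        end = pos + length
        if end > len(buf):                    # never trust a length field
            raise ValueError("length field runs past end of buffer")
        return buf[pos:end].decode("utf-8"), end

    vendor, pos = read_string(offset)
    (count,) = struct.unpack_from(fmt, buf, pos)
    pos += 4
    comments = []
    for _ in range(count):
        text, pos = read_string(pos)
        comments.append(text)
    return vendor, comments, pos
```

Checking each length against the buffer size before slicing is the kind of bounds check the security considerations ask for.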
The framing bit is not used for RTP transportation and so
applications constructing Vorbis files MUST take care to set
this if required.
If the o (overflow) bit is set then the URI of a whole header block
is specified in an overflow URI field, which is a null terminated
UTF-8 string. The header file specified at the URI MUST NOT have
the overflow flag set, otherwise a loop condition will occur. If
SDP information is available then the URI value set there MUST take
precedence.
6 Session Description for Vorbis RTP Streams
Session description information concerning the Vorbis stream
SHOULD be provided if possible and must be in accordance with
[8]. The contents of the Vorbis Header file referred to in the
u attribute must contain all three of the config header blocks
as specified above. The overflow bit of the header packet must
not be set.
u=<URI of Vorbis header file>
m=audio <port> RTP/AVP 98
c=IN IP4 <URI of Vorbis stream>
a=rtpmap:98 vorbis/<sample rate>
The port value is specified by the server application bound to
the URI specified in the c attribute. The clock rate value specified
in the a attribute MUST match the Vorbis sample rate value.
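As an illustration, the four lines above could be generated as follows. This is a sketch of the fragment only, not a complete session description, and the function name and its parameters are hypothetical.

```python
def vorbis_sdp_fragment(header_uri: str, port: int, address: str,
                        sample_rate: int, payload_type: int = 98) -> str:
    """Build the u=, m=, c= and a= lines for a Vorbis RTP stream."""
    lines = [
        f"u={header_uri}",
        f"m=audio {port} RTP/AVP {payload_type}",
        f"c=IN IP4 {address}",
        # The rtpmap clock rate must equal the Vorbis sample rate.
        f"a=rtpmap:{payload_type} vorbis/{sample_rate}",
    ]
    return "\r\n".join(lines) + "\r\n"
```

Passing the same sample_rate value that is written into the VORB header keeps the two descriptions consistent.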
7 Congestion Control
Vorbis clients SHOULD send regular receiver reports detailing
congestion. A mechanism for dynamically downgrading the stream,
known as bitrate peeling, will allow for a graceful backing off
of the stream bitrate. This feature is not available at present
so an alternative would be to redirect the client to a lower
bitrate stream if one is available.
8 Security Considerations
RTP packets using this payload format are subject to the security
considerations discussed in the RTP specification [3]. This implies
that the confidentiality of the media stream is achieved by using
encryption. Because the data compression used with this payload
format is applied end-to-end, encryption may be performed on the
compressed data. Wherever a data block carries a length field, care
must be taken to prevent buffer overflows in client applications.
9 Acknowledgments
This I-D is a continuation of draft-moffitt-vorbis-rtp-00.txt. The
MIME type section is a continuation of
draft-short-avt-rtp-vorbis-mime-00.txt.
Thanks to the AVT and Ogg Vorbis communities and the Xiph.org team,
including Steve Casner, Ralph Giles, Tor-Einar Jarnbjo, John Lazzaro,
Jack Moffitt, Colin Perkins, Barry Short and Mike Smith.
10 References
1. Key words for use in RFCs to Indicate Requirement Levels
(RFC 2119), S. Bradner.
2. libvorbis: Available from the Xiph website, http://www.xiph.org
3. RTP: A Transport Protocol for Real-Time Applications (RFC 1889),
Schulzrinne, et al.
4. RTP: A transport protocol for real-time applications. Work
in progress, draft-ietf-avt-rtp-new-11.txt.
5. RTP Profile for Audio and Video Conferences with Minimal Control.
Work in progress, draft-ietf-avt-profile-new-12.txt.
6. Ogg Vorbis I spec: Codec setup and packet decode.
http://www.xiph.org/ogg/vorbis/doc/vorbis-spec-ref.html
7. Ogg Vorbis I spec: Comment field and header specification.
http://www.xiph.org/ogg/vorbis/doc/v-comment.html
8. SDP: Session Description Protocol (RFC 2327), Handley, M. and
V. Jacobson.
11 Full Copyright Statement
Copyright (C) The Internet Society (2003). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
12 Author's Address
Phil Kerr
Centre for Music Technology
University of Glasgow
Glasgow, Scotland
UK, G12 8LT
Phone: +44 141 330 5740
Email: philkerr@elec.gla.ac.uk
WWW: http://www.xiph.org/
--- >8 ----