On Thu, 16 Oct 2008, Jean-Marc Valin wrote:

> Aymeric Moizard wrote:
>>> None of that is defined yet, though I'm open to suggestions on how to do
>>> the mapping.
>>
>> CELT/44100 and CELT/48000
>> a=fmtp:105 stereo=on
>>
>> probably a latency value?
>
> It would definitely need a frame_size value
>
>> "CELT" doesn't seem to be used for any other existing audio related
>> stuff? right?
>
> Not yet. It's very new (bit-stream not frozen yet), so at this point,
> there are a few early adopters, but that's it.
>
>> I didn't read the doc fully...
>>
>> One question for you Jean-Marc: can you confirm that the decoder
>> will receive enough information to autoconfigure itself when receiving
>> RTP streams? One documentation sentence seems to not go in that
>> direction while it's a requirement for VoIP and mainly for SIP
>> negotiation.
>
> No. Unlike Speex, CELT will not be able to decode anything with no prior
> information on the stream. To decode a stream of packets, CELT needs the
> following information
> 1) Sampling rate. It's not just for setting the soundcard's rate, if you
> use the wrong rate, you get garbage.
> 2) Mono/Stereo
> 3) Frame size in samples
> 4) If there's more than one frame in a packet, you need to tell it where
> the boundary is. However, once that's done, CELT knows the bit-rate used
> just from the packet size, so there's no need to signal a fixed
> communication rate.
>
> The main reason CELT can't do like Speex (I wish it could) is that in
> Speex, the overhead of transmitting the mode info is 5 bits for
> narrowband and 9 bits for wideband. With 20 ms frames, that's just 250
> and 450 bps. With CELT, there would be a bit more data needed and the
> frame size can be as small as 2 ms, so we could end up with several kbps
> of mode signalling. In the current code, there's no signalling at all.
> The good thing is that after a few frames, the decoder should at least
> realise it's decoding garbage.

I understand, but CELT would be useless for SIP if one can't read/guess
the decoder configuration correctly from the RTP data.

One possible way to cope with this would be to have several CELT payload
defines for use in SIP signalling. This is usually not well accepted as
this would remove flexibility and increase size and error within SIP
negotiation.

I don't think this requirement is only for SIP: any device receiving
data would reasonably want to be sure it decodes it correctly, no
matter the overhead.

One approach which is very acceptable to me would be to have something
like SPS/PPS for H.264: you send a special marker (one bit?) to flag
each packet as either data or decoding information.

The first packet sent is a "decoding information data" content and
other packets are "real data". With RTP, you retransmit this packet
regularly to cope with packet loss or delayed initiation (initial packets
are often lost at the beginning on one side of the conversation).

I think this approach will fit your need for keeping CELT as
low as possible but mandatory for VoIP.

tks,
Aymeric MOIZARD / ANTISIP
amsip - http://www.antisip.com
osip2 - http://www.osip.org
eXosip2 - http://savannah.nongnu.org/projects/exosip/
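For illustration only, the "decoding information" packet idea above could be sketched roughly as follows. Nothing here is part of CELT or of any RTP payload format: the flag value, the field layout, the names (CONFIG_FLAG, DecoderConfig, payload_stream, Receiver) and the resend interval are all assumptions made up for the sketch. It also spends a whole byte on the marker, where the proposal only asks for one bit.

```python
import struct
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional, Tuple, Union

CONFIG_FLAG = 0x80  # hypothetical marker: top bit of the first payload byte


@dataclass(frozen=True)
class DecoderConfig:
    sample_rate: int  # Hz
    channels: int     # 1 = mono, 2 = stereo
    frame_size: int   # samples per frame


def pack_config(cfg: DecoderConfig) -> bytes:
    # Flag byte, 32-bit rate, 8-bit channel count, 16-bit frame size (network order).
    return struct.pack("!BIBH", CONFIG_FLAG, cfg.sample_rate, cfg.channels, cfg.frame_size)


def pack_data(frame: bytes) -> bytes:
    # A zero first byte marks an ordinary data packet.
    return b"\x00" + frame


def unpack(payload: bytes) -> Tuple[str, Union[DecoderConfig, bytes]]:
    if payload[0] & CONFIG_FLAG:
        _, rate, channels, frame_size = struct.unpack("!BIBH", payload)
        return "config", DecoderConfig(rate, channels, frame_size)
    return "data", payload[1:]


def payload_stream(cfg: DecoderConfig, frames: Iterable[bytes],
                   config_interval: int = 50) -> Iterator[bytes]:
    # Send the config first, then repeat it before every config_interval-th frame.
    for i, frame in enumerate(frames):
        if i % config_interval == 0:
            yield pack_config(cfg)
        yield pack_data(frame)


class Receiver:
    def __init__(self) -> None:
        self.cfg: Optional[DecoderConfig] = None
        self.decodable: List[bytes] = []

    def feed(self, payload: bytes) -> None:
        kind, value = unpack(payload)
        if kind == "config":
            self.cfg = value
        elif self.cfg is not None:
            self.decodable.append(value)
        # Data that arrives before any config packet cannot be decoded and is dropped.
```

A receiver that misses the first config packet stays silent until the next repeat, so the resend interval trades start-up delay against the extra bandwidth.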
> I understand, but CELT would be useless for SIP if one can't read/guess
> correctly decoder configuration from the RTP data.
>
> One possible way to cope with this would be to have several CELT payload
> defines for use in SIP signalling. This is usually not well accepted as
> this would remove flexibility and increase size and error within SIP
> negotiation.

Sorry for going OT, but this is a good example of why I dislike SIP; to
me it seems like someone sticky-taped a bunch of not quite ideally
suited standards together (HTTP? Really? Over UDP??), much of which is
then ignored by people who 'just need to get calls working'. The result
is that you might not be able to rely on something as fundamental as
media attributes being parsed in an SDP.

/rant,
Dave
On Fri, Oct 17, 2008 at 8:38 AM, David Hogan <david.hogan at freshtel.net> wrote:

>> I understand, but CELT would be useless for SIP if one can't read/guess
>> correctly decoder configuration from the RTP data.
>>
>> One possible way to cope with this would be to have several CELT payload
>> defines for use in SIP signalling. This is usually not well accepted as
>> this would remove flexibility and increase size and error within SIP
>> negotiation.
>
> Sorry for going OT, but this is a good example of why I dislike SIP; to
> me it seems like someone sticky-taped a bunch of not quite ideally
> suited standards together (HTTP? Really? Over UDP??), much of which is
> then ignored by people who 'just need to get calls working'. The result
> is that you might not be able to rely on something as fundamental as
> media attributes being parsed in an SDP.

I'm sorry to hurt your feelings, but this problem will arise not only
with SIP, but with any protocol that tries to negotiate a common codec
set with SDP. So the real problem here is SDP rather than SIP, and this
is well known.

--
Regards,
Alexander Chemeris.

SIPez LLC.
SIP VoIP, IM and Presence Consulting
http://www.SIPez.com
tel: +1 (617) 273-4000
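To make the SDP side of this concrete, here is a rough sketch of how a receiver might pull CELT decoder parameters out of the media attributes. It rests on several assumptions: the CELT/48000 rtpmap name, payload type 105 and the stereo=on parameter are taken from the mapping suggested earlier in the thread, while frame_size as an fmtp key and the helper names (CeltParams, parse_celt_sdp) are hypothetical. No CELT mapping has been defined yet.

```python
import re
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CeltParams:
    payload_type: int
    sample_rate: int
    stereo: bool = False
    frame_size: Optional[int] = None  # samples; None if the SDP did not say


def parse_celt_sdp(sdp: str) -> Dict[int, CeltParams]:
    """Collect CELT payload types and their fmtp parameters from an SDP body."""
    params: Dict[int, CeltParams] = {}
    fmtp: Dict[int, str] = {}
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            name, rate = encoding.split("/")[:2]
            if name.upper() == "CELT":
                params[int(pt)] = CeltParams(int(pt), int(rate))
        elif line.startswith("a=fmtp:"):
            pt, _, rest = line[len("a=fmtp:"):].partition(" ")
            fmtp[int(pt)] = rest
    # Apply fmtp afterwards so the order of the attribute lines does not matter.
    for pt, rest in fmtp.items():
        if pt not in params:
            continue
        for item in re.split(r"[;\s]+", rest.strip()):
            key, sep, value = item.partition("=")
            if not sep:
                continue
            if key == "stereo":
                params[pt].stereo = value.lower() == "on"
            elif key == "frame_size":  # hypothetical parameter name
                params[pt].frame_size = int(value)
    return params
```

If an endpoint ignores the fmtp line, the decoder is left without a frame size, which is exactly the failure mode discussed above.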
Use Jingle. Anyway, Jingle beats SIP in almost every aspect, especially in
the way the standards are made.

Diana

David Hogan wrote:
>> I understand, but CELT would be useless for SIP if one can't read/guess
>> correctly decoder configuration from the RTP data.
>>
>> One possible way to cope with this would be to have several CELT payload
>> defines for use in SIP signalling. This is usually not well accepted as
>> this would remove flexibility and increase size and error within SIP
>> negotiation.
>
> Sorry for going OT, but this is a good example of why I dislike SIP; to
> me it seems like someone sticky-taped a bunch of not quite ideally
> suited standards together (HTTP? Really? Over UDP??), much of which is
> then ignored by people who 'just need to get calls working'. The result
> is that you might not be able to rely on something as fundamental as
> media attributes being parsed in an SDP.
>
> /rant,
> Dave
>> The main reason CELT can't do like Speex (I wish it could) is that in
>> Speex, the overhead of transmitting the mode info is 5 bits for
>> narrowband and 9 bits for wideband. With 20 ms frames, that's just 250
>> and 450 bps. With CELT, there would be a bit more data needed and the
>> frame size can be as small as 2 ms, so we could end up with several kbps
>> of mode signalling. In the current code, there's no signalling at all.
>> The good thing is that after a few frames, the decoder should at least
>> realise it's decoding garbage.
>
> I understand, but CELT would be useless for SIP if one can't read/guess
> correctly decoder configuration from the RTP data.

Why is that? Isn't it the whole point of SDP that you first negotiate
before sending data?

> One possible way to cope with this would be to have several CELT payload
> defines for use in SIP signalling. This is usually not well accepted as
> this would remove flexibility and increase size and error within SIP
> negotiation.

There are too many possible parameter combinations to make that viable
anyway.

> I don't think this requirement is only for SIP: any device receiving
> data would reasonably want to be sure it decodes it correctly. No
> matter the overhead.

That's a totally different issue. In CELT, there's always about 1 bit
left unused when the encoder is done. CELT encodes a known value there,
but because it's encoded with the range coder, a decoder not using the
exact same mode will get something random and the check for that known
value will fail. In practice, it only takes a couple of frames before the
decoder realises that there's an error (it can't tell whether it's an
error in the transmission or a mode/version mismatch).

> One approach which is very acceptable to me would be to have something
> like SPS/PPS for H.264: you will send a special data packet (one bit?)
> to mark the packet as data or as decoding data.
>
> The first packet sent is a "decoding information data" content and
> other packets are "real data". With RTP, you will retransmit this packet
> regularly to cope with packet loss or delayed initiation (initial packets
> are often lost at the beginning on one side of the conversation).

That's something that *could* be done. What I'm not sure about is how
complicated it would make it for clients to implement. I think there was
a similar issue with Vorbis (which *requires* a large header before
you're able to decode anything) and it got messy. But again, I don't
know the details.

> I think this approach will fit your need for keeping CELT as
> low as possible but mandatory for VoIP.

What do you mean here?

	Jean-Marc
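Two back-of-envelope calculations sit behind the numbers in this message, sketched below. The first is per-frame signalling overhead: bits per frame divided by frame duration. The 250 and 450 bps figures match 5 and 9 bits per 20 ms Speex frame; the 8-bit figure used for CELT is purely hypothetical, standing in for "a bit more data". The second is a crude model of the known-value check. It assumes a mismatched decoder passes each frame's roughly 1-bit check independently with probability 1/2, which is a simplification and not a statement about the actual range coder.

```python
def signalling_overhead_bps(mode_bits: float, frame_ms: float) -> float:
    """Bit rate spent on mode info when mode_bits are sent with every frame."""
    return mode_bits * 1000.0 / frame_ms


def undetected_mismatch_probability(frames: int, check_bits: float = 1.0) -> float:
    """Chance that a mismatched decoder passes the known-value check on every
    one of `frames` frames, assuming independent, uniformly random outcomes."""
    return 2.0 ** (-check_bits * frames)


if __name__ == "__main__":
    print(signalling_overhead_bps(5, 20))   # Speex narrowband mode info
    print(signalling_overhead_bps(9, 20))   # Speex wideband mode info
    print(signalling_overhead_bps(8, 2))    # hypothetical 8 bits on 2 ms CELT frames
    for n in (1, 2, 5, 10):
        print(n, undetected_mismatch_probability(n))
```

Under that model a mismatch survives ten frames with probability of about one in a thousand, and at 2 ms per frame ten frames is only 20 ms of audio.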