> It is unusual to require a different sequence of API calls
> depending on whether the signal is Mono or Stereo.
> This is especially evident in the decoder where you would
> register a callback for the SPEEX_INBAND_STEREO message.
> The need for explicitly calling the speex_en/decode_stereo()
> function in addition to speex_en/decode() is cumbersome.
> All these tasks could be done by the API automatically,
> once it knows that it is dealing with a Stereo source.

I'll need to think about that one. Stereo support is really some "extra"
information that the decoder can decide to use or ignore.

> *** Filling the "bit bucket".
>
> An additional function similar to speex_bits_read_from()
> would come in handy. This should append the bytes passed
> into it to the bytes already in the buffer instead of
> replacing the buffer contents.
>
> The intended use would be for streaming sources that bring
> in data in small portions (smaller than an encoded frame).
> This way the speex_bits could be used as a buffer.
>
> Suggested name: speex_bits_append_from(). Or alternatively
> a flag passed into speex_bits_read_from() signalling whether
> to append or to replace.

If I understand what you said correctly, there is such a call:
speex_bits_read_whole_bytes, which adds a couple of bytes to the SpeexBits
struct while removing the ones that have already been read.

> *** Querying the codec if there's enough data for a full frame
> decode and verifying the bitstream
>
> It appears there is no way to tell in advance if the decoder
> has enough information in the speex_bits buffer to decode a full
> frame of audio. Sure, after a decode attempt one could see if
> speex_bits_remaining() returns a negative value, but this POST
> DECODE check would not allow decoding to be retried gracefully.
>
> A function like speex_verify_decode(st, &speex_bits) could
> perform two functions:
> - verify the syntax of the bitstream
> - verify that the input data has sufficient length for decoding
>   a frame
> - but NOT actually perform any decoding

The problem is that this is not possible. The reason is that if you cut
the stream just before some optional extra information, you can't know
whether it's there or not.

> This function would be required for progressively decoding a
> VBR bitstream that contains no side information (page sizes).
> Without it, I could not (easily) implement VBR decoding in my
> ACM codec. And storing frame sizes between frames is a waste
> of bits in a .WAV container ;)

I don't think it's a good idea to simply concatenate all the frames
together. For example, it makes it impossible to recover from corrupted
streams and seeking becomes complicated. I would suggest doing something
similar to what Ogg does.

    Jean-Marc

-- 
Jean-Marc Valin, M.Sc.A.
LABORIUS (http://www.gel.usherb.ca/laborius)
Université de Sherbrooke, Québec, Canada
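One way to read the framing suggestion above is a simple length prefix in
front of every encoded frame. The sketch below is only an illustration under
that assumption: the two-byte big-endian prefix and the name frame_ready()
are hypothetical, and the layout is far simpler than Ogg's real page and
lacing format. It shows how a reader can tell that a whole frame is buffered
before it attempts a decode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container layout (NOT Ogg's real format):
 *   [len_hi][len_lo][len bytes of one encoded frame] ...
 */

/* Returns 1 and fills *payload and *len when a whole frame is buffered.
 * Returns 0 when more bytes must arrive first; outputs are left untouched. */
int frame_ready(const unsigned char *buf, size_t avail,
                const unsigned char **payload, size_t *len)
{
    size_t n;

    assert(buf != NULL && payload != NULL && len != NULL);

    if (avail < 2)
        return 0;                        /* length prefix incomplete */

    n = ((size_t)buf[0] << 8) | (size_t)buf[1];
    if (avail - 2 < n)
        return 0;                        /* payload incomplete */

    *payload = buf + 2;                  /* first byte of the frame */
    *len = n;
    return 1;
}
```

Once frame_ready() reports a complete frame, the payload could be handed to
speex_bits_read_from() and then decoded. The price is two bytes of overhead
per frame, which is exactly the side information the original post wanted
to avoid.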
Hi there.

I made a couple of notes regarding the speex API. These are not meant
to be seen as criticism, but rather as constructive suggestions. I like
speex a lot, but I would like it even more if some of the following
features were available. ;)

Christian


Comments about the Speex API - from a developer's perspective
-------------------------------------------------------------

*** Unnecessary additional work for the developer on Stereo sources:

It is unusual to require a different sequence of API calls
depending on whether the signal is Mono or Stereo.
This is especially evident in the decoder where you would
register a callback for the SPEEX_INBAND_STEREO message.
The need for explicitly calling the speex_en/decode_stereo()
function in addition to speex_en/decode() is cumbersome.
All these tasks could be done by the API automatically,
once it knows that it is dealing with a Stereo source.

A unified API not requiring separate code paths would be helpful.


*** Filling the "bit bucket".

An additional function similar to speex_bits_read_from()
would come in handy. This should append the bytes passed
into it to the bytes already in the buffer instead of
replacing the buffer contents.

The intended use would be for streaming sources that bring
in data in small portions (smaller than an encoded frame).
This way the speex_bits could be used as a buffer.

Suggested name: speex_bits_append_from(). Or alternatively
a flag passed into speex_bits_read_from() signalling whether
to append or to replace.


*** Querying the codec if there's enough data for a full frame
decode and verifying the bitstream

It appears there is no way to tell in advance if the decoder
has enough information in the speex_bits buffer to decode a full
frame of audio. Sure, after a decode attempt one could see if
speex_bits_remaining() returns a negative value, but this POST
DECODE check would not allow decoding to be retried gracefully.

A function like speex_verify_decode(st, &speex_bits) could
perform two functions:
- verify the syntax of the bitstream
- verify that the input data has sufficient length for decoding
  a frame
- but NOT actually perform any decoding

In case either of these conditions is not met, the decoder
could (for example) try to remedy the problem by
- buffering more data
- notifying the sender about corruption and/or requesting a resend

This function would be required for progressively decoding a
VBR bitstream that contains no side information (page sizes).
Without it, I could not (easily) implement VBR decoding in my
ACM codec. And storing frame sizes between frames is a waste
of bits in a .WAV container ;)

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'speex-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body. No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
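
The append-style "bit bucket" requested in this message can be sketched
without libspeex at all. ByteBucket and its functions are hypothetical names,
not Speex API; the fixed capacity is also an assumption. The sketch only
illustrates the behaviour of dropping bytes that were already consumed and
appending new ones behind the unread remainder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUCKET_CAP 2048              /* assumed capacity, in bytes */

typedef struct {
    unsigned char data[BUCKET_CAP];
    size_t len;                      /* bytes currently stored */
    size_t pos;                      /* bytes already consumed */
} ByteBucket;

void bucket_init(ByteBucket *b)
{
    b->len = 0;
    b->pos = 0;
}

/* Number of bytes that have been stored but not yet consumed. */
size_t bucket_unread(const ByteBucket *b)
{
    return b->len - b->pos;
}

/* Mark n unread bytes as consumed (for example after decoding a frame). */
void bucket_consume(ByteBucket *b, size_t n)
{
    assert(n <= bucket_unread(b));
    b->pos += n;
}

/* Discard consumed bytes, then append n new bytes after the unread ones.
 * Returns 0 on success, -1 if the new data would not fit. */
int bucket_append(ByteBucket *b, const unsigned char *src, size_t n)
{
    size_t unread = bucket_unread(b);

    if (n > BUCKET_CAP - unread)
        return -1;

    memmove(b->data, b->data + b->pos, unread);  /* compact to the front */
    b->len = unread;
    b->pos = 0;

    memcpy(b->data + b->len, src, n);            /* append, do not replace */
    b->len += n;
    return 0;
}
```

A streaming source would call bucket_append() for every small chunk it
receives and bucket_consume() after each successful decode.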
> If I understand what you said correctly, there is such a call:
> speex_bits_read_whole_bytes, which adds a couple of bytes to the SpeexBits
> struct while removing the ones that have already been read.

Oh, how could I not have seen that one? That is exactly what I need.
Thanks for the pointer.

>> It is unusual to require a different sequence of API calls
>> depending on whether the signal is Mono or Stereo.
> I'll need to think about that one. Stereo support is really some "extra"
> information that the decoder can decide to use or ignore.

Well, in that case I would tell the API: "force decode to Mono". The
default mode, however, would be "decode to Mono or Stereo as present in
the bitstream". The more I think about it, the less this looks like a
high priority.

>> - verify the syntax of the bitstream
>> - verify that the input data has sufficient length for decoding
>>   a frame
> The problem is that this is not possible. The reason is that if you cut
> just before some optional extra information, you can't know if it's
> there or not.

I will probably use a termination code in the bitstream following
each group of frames (inband signalling). This would apply to VBR
modes only.

>> And storing frame sizes between frames is a waste of bits in a
>> .WAV container ;)
> I don't think it's a good idea to simply concatenate all the frames
> together. For example, it makes it impossible to recover from corrupted
> streams and seeking becomes complicated. I would suggest doing something
> similar to what Ogg does.

For CBR, simple concatenation works perfectly. The block size in the WAVE
file is defined by the nBlockAlign value in the WAVEFORMATEX header. These
blocks are of a constant size and facilitate seeking. By grouping a number
of frames together the padding loss becomes minimal (or zero).

Example: CBR Quality 4 Stereo yields 309 bits per frame. I group eight
frames into a block and get 8*309 bits = 2472 bits = 309 bytes for the
nBlockAlign value. No bits are lost to padding.
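
The block arithmetic in the example above can be checked with a few lines
of C. The helper names are hypothetical; the sketch assumes frames are packed
back to back and that a block is padded up to the next whole byte.

```c
#include <assert.h>

/* Greatest common divisor (Euclid). */
static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Smallest number of frames whose total bit count is a multiple of 8. */
unsigned frames_per_block(unsigned bits_per_frame)
{
    assert(bits_per_frame > 0);
    return 8u / gcd_u(bits_per_frame, 8u);
}

/* Block size in bytes for a group of frames, rounded up to a whole byte.
 * This is the value that would go into nBlockAlign. */
unsigned block_align_bytes(unsigned bits_per_frame, unsigned frames)
{
    return (bits_per_frame * frames + 7u) / 8u;
}

/* Bits wasted on padding at the end of each block. */
unsigned padding_bits(unsigned bits_per_frame, unsigned frames)
{
    return block_align_bytes(bits_per_frame, frames) * 8u
           - bits_per_frame * frames;
}
```

For 309 bits per frame, frames_per_block() gives 8 and block_align_bytes()
gives 309 with no padding, matching the figures above. Storing a single
frame per block would instead need 39 bytes and waste 3 bits per block.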

I see WAV primarily as an archival format. I am not very concerned about
corruption there. All reads and decodes are performed at multiples of
nBlockAlign and you will never have a critical out-of-sync issue.

However, ACM and WAV generally do not cope well with the concept of a
variable bitrate (the lack of a seek index and the nBlockAlign methodology
are an indication of that). I can try... but I am not sure it will work
reliably.