thr3ads.net - Speex dev - [speex-dev] API suggestions [Aug 2004]

Jean-Marc Valin

2004-Aug-06 15:01 UTC

[speex-dev] API suggestions

> It is unusual to require a different sequence of API calls
> depending on whether the signal is Mono or Stereo.
> This is especially evident in the decoder where you would
> register a callback for the SPEEX_INBAND_STEREO message.
> The need for explicitly calling the speex_en/decode_stereo()
> function in addition to speex_en/decode() is cumbersome.
> All these tasks could be done by the API automatically,
> once it knows that it is dealing with a Stereo source.
I'll need to think about that one. Stereo support is really some "extra"
information that the decoder can decide to use or ignore.
> *** Filling the "bit bucket".
> 
> An additional function similar to speex_bits_read_from()
> would come in handy. This should append the bytes passed
> into it to the bytes already in the buffer instead of
> replacing the buffer contents.
> 
> The intended use would be for streaming sources that bring
> in data in small portions (smaller than an encoded frame).
> This way the speex_bits could be used as a buffer.
> 
> Suggested name: speex_bits_append_from(). Or alternatively
> a flag passed into speex_bits_read_from() signalling whether
> to append or to replace.
If I understand what you said correctly, there is such a call:
speex_bits_read_whole_bytes, which adds a couple of bytes to the SpeexBits
struct while removing the ones that have already been read.
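
The append-and-discard behaviour described here can be pictured with a small stand-alone model. This is only a sketch: it does not use libspeex, and `ByteBuf`, `bytebuf_init`, `bytebuf_append`, `bytebuf_consume` and `BUF_CAPACITY` are hypothetical names invented for illustration, not part of the Speex API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_CAPACITY 256  /* arbitrary size chosen for this sketch */

/* Hypothetical stand-in for SpeexBits: a byte buffer with a read cursor. */
typedef struct {
    unsigned char data[BUF_CAPACITY];
    size_t len;       /* number of valid bytes in data */
    size_t read_pos;  /* index of the first byte not yet consumed */
} ByteBuf;

void bytebuf_init(ByteBuf *b) {
    b->len = 0;
    b->read_pos = 0;
}

/* Drop the bytes that were already read, then append the new chunk.
   Returns 0 on success, -1 if the chunk does not fit. */
int bytebuf_append(ByteBuf *b, const unsigned char *chunk, size_t n) {
    size_t unread;
    assert(b->read_pos <= b->len);
    unread = b->len - b->read_pos;
    if (n > BUF_CAPACITY - unread)
        return -1;
    memmove(b->data, b->data + b->read_pos, unread); /* keep unread tail */
    memcpy(b->data + unread, chunk, n);              /* append new bytes */
    b->len = unread + n;
    b->read_pos = 0;
    return 0;
}

/* Consume up to n bytes; returns how many were actually consumed. */
size_t bytebuf_consume(ByteBuf *b, size_t n) {
    size_t unread = b->len - b->read_pos;
    if (n > unread)
        n = unread;
    b->read_pos += n;
    return n;
}
```

A streaming source would call `bytebuf_append` for each small chunk as it arrives, so the buffer only ever holds the bytes that have not been decoded yet.
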
> *** Querying the codec if there's enough data for a full frame
>     decode and verifying the bitstream
> 
> It appears there is no way to tell in advance if the decoder
> has enough information in the speex_bits buffer to decode a full
> frame of audio. Sure, after a decode attempt one could see if
> speex_bits_remaining() returns a negative value, but this POST
> DECODE check would not allow to retry decoding gracefully.
> 
> A function like speex_verify_decode(st, &speex_bits) could
> perform two functions:
> - verify the syntax of the bitstream
> - verify that the input data has sufficient length for decoding
>   a frame
> - but NOT actually perform any decoding
The problem is that this is not possible. The reason is that if you cut
just before some optional extra information, you can't know if it's
there or not.
> This function would be required for progressively decoding a
> VBR bitstream that contains no side information (page sizes).
> Without it, I could not (easily) implement VBR decoding in my
> ACM codec. And storing frame sizes between frames is a waste
> of bits in a .WAV container ;)
I don't think it's a good idea to simply concatenate all the frames
together. For example, it makes it impossible to recover from corrupted
streams and seeking becomes complicated. I would suggest doing something
similar to what Ogg does. 
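
As a rough illustration of packet framing, here is a minimal sketch in which every packet carries an explicit length, so a reader can tell whether a whole packet has arrived before handing it to the decoder. It is an assumption made for illustration and is NOT Ogg's actual page format (Ogg adds a sync pattern, lacing values and a checksum on top of this idea); `packet_write` and `packet_peek` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write one packet as [2-byte big-endian length][payload].
   Returns bytes written, or 0 if the packet is too long or out is too small. */
size_t packet_write(unsigned char *out, size_t out_cap,
                    const unsigned char *payload, size_t n) {
    if (n > 0xFFFF || out_cap < n + 2)
        return 0;
    out[0] = (unsigned char)(n >> 8);
    out[1] = (unsigned char)(n & 0xFF);
    memcpy(out + 2, payload, n);
    return n + 2;
}

/* Check whether a complete packet is available at the start of buf.
   Returns the payload length and sets *payload, or -1 if more data is needed. */
long packet_peek(const unsigned char *buf, size_t avail,
                 const unsigned char **payload) {
    size_t n;
    assert(payload != NULL);
    if (avail < 2)
        return -1;                 /* length prefix itself is incomplete */
    n = ((size_t)buf[0] << 8) | buf[1];
    if (avail < n + 2)
        return -1;                 /* payload is not fully buffered yet */
    *payload = buf + 2;
    return (long)n;
}
```

With a length prefix the "is there enough data?" question is answered by the container rather than by the codec, at a cost of two bytes per packet in this sketch.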

        Jean-Marc


-- 
Jean-Marc Valin, M.Sc.A.
LABORIUS (http://www.gel.usherb.ca/laborius)
Université de Sherbrooke, Québec, Canada


Christian Buchner

2004-Aug-06 15:01 UTC

[speex-dev] API suggestions

Hi there. I made a couple of notes regarding the speex API.
These are not meant to be seen as criticism - rather constructive
suggestions.

I like speex a lot, but I would like it even more if some
of the following features were available. ;)

Christian

Comments about the Speex API - from a developer's perspective
-------------------------------------------------------------

*** Unnecessary additional work for the developer on Stereo
    sources:

It is unusual to require a different sequence of API calls
depending on whether the signal is Mono or Stereo.
This is especially evident in the decoder where you would
register a callback for the SPEEX_INBAND_STEREO message.
The need for explicitly calling the speex_en/decode_stereo()
function in addition to speex_en/decode() is cumbersome.
All these tasks could be done by the API automatically,
once it knows that it is dealing with a Stereo source.

A unified API not requiring separate code paths would be
helpful.

*** Filling the "bit bucket".

An additional function similar to speex_bits_read_from()
would come in handy. This should append the bytes passed
into it to the bytes already in the buffer instead of
replacing the buffer contents.

The intended use would be for streaming sources that bring
in data in small portions (smaller than an encoded frame).
This way the speex_bits could be used as a buffer.

Suggested name: speex_bits_append_from(). Or alternatively
a flag passed into speex_bits_read_from() signalling whether
to append or to replace.

*** Querying the codec if there's enough data for a full frame
    decode and verifying the bitstream

It appears there is no way to tell in advance if the decoder
has enough information in the speex_bits buffer to decode a full
frame of audio. Sure, after a decode attempt one could see if
speex_bits_remaining() returns a negative value, but this POST
DECODE check would not allow to retry decoding gracefully.

A function like speex_verify_decode(st, &speex_bits) could
perform two functions:
- verify the syntax of the bitstream
- verify that the input data has sufficient length for decoding
  a frame
- but NOT actually perform any decoding

If either of these conditions is not met, the decoder
could (for example) try to remedy the problem by
- buffering more data
- notifying the sender about corruption and/or requesting a resend

This function would be required for progressively decoding a
VBR bitstream that contains no side information (page sizes).
Without it, I could not (easily) implement VBR decoding in my
ACM codec. And storing frame sizes between frames is a waste
of bits in a .WAV container ;)

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'speex-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Christian Buchner

2004-Aug-06 15:01 UTC

[speex-dev] API suggestions

> If I understand what you said correctly, there is such a call:
> speex_bits_read_whole_bytes, which adds a couple of bytes to the SpeexBits
> struct while removing the ones that have already been read.
Oh, how could I not have seen that one? That is exactly what I need.
Thanks for the pointer.
>> It is unusual to require a different sequence of API calls
>> depending on whether the signal is Mono or Stereo.
> I'll need to think about that one. Stereo support is really some "extra"
> information that the decoder can decide to use or ignore.
Well in that case I would tell the API: "force decode to Mono". The
default mode however would be "decode to Mono or Stereo as present
in the bitstream".

This doesn't really have a high priority (the more I think about it).
>> - verify the syntax of the bitstream
>> - verify that the input data has sufficient length for decoding
>>   a frame
> The problem is that this is not possible. The reason is that if you cut
> just before some optional extra information, you can't know if it's
> there or not.
I will probably use a termination code in the bitstream following each
group of frames (inband signalling). This would apply to VBR modes only.
>> And storing frame sizes between frames is a waste of bits in a
>> .WAV container ;)
> I don't think it's a good idea to simply concatenate all the frames
> together. For example, it makes it impossible to recover from corrupted
> streams and seeking becomes complicated. I would suggest doing something
> similar to what Ogg does. 
For CBR this works perfectly. The block size in the WAVE file is defined
by the nBlockAlign value in the WAVEFORMATEX header. These blocks are
of a constant size and facilitate seeking. By grouping a number of frames
together the padding loss becomes minimal (or zero).

Example: CBR Quality 4 Stereo yields 309 bits per frame. I group eight
frames into a block and get 8*309 bits = 309 bytes for the nBlockAlign
value. No bits are lost to padding.
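
The arithmetic generalises: the smallest group of frames that ends on a byte boundary is 8 / gcd(bits per frame, 8). A small sketch of that calculation follows; the function names are made up for illustration and are not part of Speex or ACM.

```c
#include <assert.h>

/* Greatest common divisor (Euclid's algorithm). */
unsigned gcd_u(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Smallest number of frames whose total bit count is a whole number of bytes. */
unsigned frames_per_block(unsigned bits_per_frame) {
    assert(bits_per_frame > 0);
    return 8 / gcd_u(bits_per_frame, 8);
}

/* Block size in bytes (a candidate nBlockAlign) for that frame count. */
unsigned block_align_bytes(unsigned bits_per_frame) {
    return frames_per_block(bits_per_frame) * bits_per_frame / 8;
}
```

For 309 bits per frame this gives 8 frames and 309 bytes, matching the figures above. A hypothetical 220-bit frame would need only 2 frames per block, for 55 bytes.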

I see WAV primarily as an archival format. I am not very concerned
about corruption there. All reads and decodes are performed at multiples
of nBlockAlign and you will never have a critical out-of-sync issue.

However ACM and WAV generally do not cope well with the concept of a
variable bitrate (the lack of a seek index and the nBlockAlign methodology
are an indication of that). I can try... but I am not sure if it works
reliably.

