Nathaniel Meyer
2004-Sep-14 15:09 UTC
[Speex-dev] Speex encoding/decoding producing garbled audio
Whoops, left this message in my outbox. I managed to fix the problem. I was only copying 160 bytes (the frame size in samples) back into the audio stream when I should have been copying 320 (chars <-> shorts confused me there). That's why I could hear myself yet it was distorted: half the wav was missing =)

To answer some of the other questions here, for any insight into what I'm doing:

Colin, I'm using DirectSoundCapture to get the audio. I set it up so I only have one capture buffer running and one stream buffer per person. Only active people have their buffers running while the others remain dormant. I noticed Microsoft just fills the buffer with silence and continues playing it, but I figure that's going to have an impact on performance. To eliminate garbled playback, I also set a timer so that streamed audio only starts playing X milliseconds after the first packet is received. Something along the lines of 300-500 milliseconds, which is enough time to give the stream a head start, unless you have a really bad connection. But for testing purposes with this API I'm feeding the captured audio directly into the encoder / decoder with no network. Thanks for the data conversion tip too.

Reed, DirectSound uses void* as a datatype. In general it takes bytes (for playback), but in 16-bit mode it expects shorts when capturing. It accepts ranges from -128 -> 127 (8-bit mode) or -32768 -> 32767 (16-bit mode). Currently I'm recording mono 8000 Hz at 16 bits/sample and I'm expecting the decoder to produce the same. Default settings show it operates at a 1.875 KB/sec bitrate, 8000 Hz sample rate, and 16 bits/sample. Seems appropriate.

Nate

> ----- Original Message -----
> From: "Nathaniel Meyer" <nath_meyer@hotmail.com>
> To: <speex-dev@xiph.org>
> Sent: Sunday, September 12, 2004 7:03 PM
> Subject: [Speex-dev] Speex encoding/decoding producing garbled audio
>
>> I'm getting garbled playback with decoded fragments and I'm hoping
>> someone here can point me in the right direction to correcting the
>> problem.
>>
>> Essentially I'm capturing audio from the microphone. I stream it over the
>> net, but for testing purposes with this API I'm just grabbing the whole
>> chunk and encoding / decoding it right away and then updating the sound
>> buffer for playback. The playback sounds very scratchy with a bit of a
>> buzz sound and some skipping; yet I can still make it out somewhat. At
>> first I thought maybe I was doing data conversion between bytes and shorts
>> incorrectly, so I temporarily moved over to a short-based system. Still
>> the problem persisted, so perhaps it could be a setting or two I'm
>> missing? I posted my code below, demonstrating how I'm encoding and
>> decoding the buffers. I can't see anything wrong with it, so I'm guessing
>> my problem lies elsewhere. If anyone experienced a similar problem
>> beforehand, it would be nice to know what I could be doing wrong. As far
>> as the system itself, I can perfectly record audio at any channel
>> setting, sample rate, or bit-rate and play it back fine.
>>
>> - I'm using Speex version 1.1.6. I've also used 1.0.4 beforehand and
>> experienced the same problem with it.
>>
>> 1) I initialize the bits, encoder, and decoder as normal (default
>> settings seemed appropriate):
>>
>>     speex_bits_init(&mBits);
>>     mEncode = speex_encoder_init(&speex_nb_mode);
>>     mDecode = speex_decoder_init(&speex_nb_mode);
>>
>> 2) I record my audio at mono 8000Hz, 16bits per sample.
>>
>> 3) I encode frame-sized (320 bytes) fragments. Since I deal only with
>> char data types, I convert to 2-byte short values first and then set the
>> float buffer.
>>
>>     char *CSpeex::encode (char *buffer, int size, int &encodeSize)
>>     {
>>         char *encodedBuffer = new char[160];
>>         short speexShort;
>>         float *speexFloat = new float[160];
>>
>>         // Convert the audio to a short then to a float buffer
>>         for (int i = 0; i < 160; i++)
>>         {
>>             memcpy(&speexShort, &buffer[i*2], sizeof(short));
>>             speexFloat[i] = speexShort;
>>         }
>>
>>         // Encode the sound data using the float buffer
>>         speex_bits_reset(&mBits);
>>         speex_encode(mEncode, speexFloat, &mBits);
>>         encodeSize = speex_bits_write(&mBits, encodedBuffer, 160);
>>         delete[] speexFloat;
>>
>>         // Return the encoded buffer
>>         return encodedBuffer;
>>     }
>>
>> 4) I immediately decode the encoded buffer. Encoded size is always 38
>> bytes for this sample set and expected decoded size is 320 bytes
>>
>>     char *CSpeex::decode (char *buffer, int encodeSize)
>>     {
>>         char *decodedBuffer = new char[320];
>>         short speexShort;
>>         float *speexFloat = new float[160];
>>
>>         // Decode the sound data into a float buffer
>>         speex_bits_reset(&mBits);
>>         speex_bits_read_from(&mBits, buffer, encodeSize);
>>         speex_decode(mDecode, &mBits, speexFloat);
>>
>>         // Convert from float to short to char
>>         for (int i = 0; i < 160; i++)
>>         {
>>             speexShort = speexFloat[i];
>>             memcpy(&decodedBuffer[i*2], &speexShort, sizeof(short));
>>         }
>>         delete[] speexFloat;
>>
>>         // Return the buffer
>>         return decodedBuffer;
>>     }
>>
>> Hope no one minds the source post. I'm really stumped on this one, but
>> the benefits of using Speex versus the bloat offered in the competitors
>> are well worth the hassle. I'm looking forward to incorporating this into
>> several games for VoIP support.
>>
>> Thanks.
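
For readers who hit the same symptom, the fix described in the reply (copy 320 bytes per frame, not 160) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the thread: `kFrameSamples`, `kFrameBytes`, and `appendFrame` are hypothetical names, and a `std::vector<char>` stands in for the DirectSound stream buffer. The frame size of 160 samples and the 16-bit sample width are taken from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// One narrowband frame as described in the post: 160 samples of 16-bit audio.
constexpr std::size_t kFrameSamples = 160;
// The byte count is the sample count times the sample width, i.e. 320 bytes.
constexpr std::size_t kFrameBytes = kFrameSamples * sizeof(std::int16_t);
static_assert(kFrameBytes == 320, "a 160-sample 16-bit frame is 320 bytes");

// Hypothetical helper: append one decoded frame to a byte-oriented stream
// buffer (a std::vector<char> here, standing in for the real playback buffer).
void appendFrame(std::vector<char>& stream, const std::int16_t* frame)
{
    assert(frame != nullptr);
    const std::size_t offset = stream.size();
    stream.resize(offset + kFrameBytes);
    // memcpy takes a size in bytes. Passing kFrameSamples (160) here would
    // copy only the first half of the frame, which is the bug from the thread.
    std::memcpy(stream.data() + offset, frame, kFrameBytes);
}
```

Deriving the byte count from the sample count in one place keeps the two units from drifting apart when buffers are declared as `char` but filled with shorts.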