Greetings list,

I am working on a project in which we wish to use Speex with Google Automatic Speech Recognition (ASR): we send a Speex audio stream to the Google ASR service, and it returns the text of the spoken audio. However, Google ASR's Speex support requires the non-standard Speex-with-header-byte format, and my group cannot find any worthwhile documentation on how to encode that format properly.

As a starting point, we referred to the following blog post, which mostly focuses on using FLAC with Google ASR:

<http://mikepultz.com/2011/03/accessing-google-speech-api-chrome-11/>

That article *does* mention the following project on GitHub, which can successfully write a Speex-with-header-byte file. We have confirmed, to some degree, that Google ASR will accept such a file and return the text of the spoken audio:

<https://github.com/QXIP/Speex-with-header-bytes>

However, we have our own code that attempts to duplicate that project for a Cocoa/Objective-C application, and unfortunately it does not yet seem to yield data that Google ASR is willing to accept (we get "Bad Data" errors back when we send it this data).
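For concreteness, here is how we understand the Speex-with-header-byte framing. This is an assumption on our part, inferred from the GitHub project rather than from any Google documentation: each encoded frame is preceded by a single byte holding the length of that frame. The sketch below is plain C++ with no Speex dependency; `frameWithHeaderByte` is a hypothetical helper name, and the payload stands in for the bytes that `speex_bits_write` would produce.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Speex or any Google API): prefix one
// encoded frame with a single byte holding the payload length.
std::vector<std::uint8_t> frameWithHeaderByte(const std::uint8_t* payload,
                                              std::size_t payloadLength) {
    // A one-byte header can only describe payloads of up to 255 bytes.
    assert(payloadLength <= 255);

    std::vector<std::uint8_t> packet;
    packet.reserve(payloadLength + 1);
    // Header byte: the length of the encoded frame that follows.
    packet.push_back(static_cast<std::uint8_t>(payloadLength));
    // Frame bytes: copied unchanged after the header.
    packet.insert(packet.end(), payload, payload + payloadLength);
    return packet;  // total size is payloadLength + 1
}
```

Under this assumption, what goes on the wire for each frame is the header byte plus the payload, that is, `payloadLength + 1` bytes in total.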
I am permitted by my group to share the following body of code with you:

CODE BELOW:

    SpeexRecorder::SpeexRecorder() {
        mFileCount = 0;
        mRecordPacket = 0;
        mRecordData = NULL;
        mAudioStreamer = NULL;
        int sampling_rate = 16000;

        // Set up a wideband Speex encoder with VBR enabled.
        memset(&bits_, 0, sizeof(bits_));
        speex_bits_init(&bits_);
        encoder_state_ = speex_encoder_init(&speex_wb_mode);
        speex_encoder_ctl(encoder_state_, SPEEX_GET_FRAME_SIZE, &samples_per_frame_);
        int quality = kSpeexEncodingQuality;
        speex_encoder_ctl(encoder_state_, SPEEX_SET_QUALITY, &quality);
        int vbr = 1;
        speex_encoder_ctl(encoder_state_, SPEEX_SET_VBR, &vbr);
        memset(encoded_frame_data_, 0, sizeof(encoded_frame_data_));
    }

    SpeexRecorder::~SpeexRecorder() {
        speex_bits_destroy(&bits_);
        speex_encoder_destroy(encoder_state_);
    }

    void SpeexRecorder::WriteToFile(int16 * buf, int count) {
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

        // Drop any incomplete trailing frame.
        count -= (count % samples_per_frame_);

        for (int i = 0; i < count; i += samples_per_frame_) {
            speex_encode_int(encoder_state_, (spx_int16_t*)buf, &bits_);
            int frame_length = speex_bits_write(&bits_, encoded_frame_data_ + 1,
                                                kMaxSpeexFrameLength);
            encoded_frame_data_[0] = static_cast<char>(frame_length);
            speex_bits_reset(&bits_);

            NSUserDefaults *defs = [NSUserDefaults standardUserDefaults];
            NSData *dataToSend = [NSData dataWithBytes:encoded_frame_data_
                                                length:frame_length];
            NSArray *array = [NSArray arrayWithObjects:dataToSend,
                                                       [defs objectForKey:@"inLang"],
                                                       nil];
            NSLog(@"WriteToFile -> dataToSend: [%d]", [dataToSend length]);
            [mAudioStreamer performSelectorOnMainThread:@selector(sendDataToServer:)
                                             withObject:array
                                          waitUntilDone:YES];
        }

        [pool drain];
    }

    void SpeexRecorder::OpenNextFile() {
        mFileCount++;
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
        NSUserDefaults *defs = [NSUserDefaults standardUserDefaults];

        if (mAudioStreamer) {
            // send 0 bytes to stream to signify the end
            // stream will be closed, causing http chunking to end and google to respond
            NSData *dataToSend = [NSData dataWithBytes:0 length:0];
            NSLog(@"OpenNextFile -- #%d# -- [%d] bytes", mFileCount, [dataToSend length]);
            NSArray *array = [NSArray arrayWithObjects:dataToSend,
                                                       [defs objectForKey:@"inLang"],
                                                       nil];
            [mAudioStreamer performSelectorOnMainThread:@selector(sendDataToServer:)
                                             withObject:array
                                          waitUntilDone:YES];
        } else {
            mAudioStreamer = [[AudioStreamer alloc] init];
            NSLog(@"OpenNextFile -- #%d# -- [%d] bytes", mFileCount, 0);
            [mAudioStreamer performSelectorOnMainThread:@selector(_setupConnection:)
                                             withObject:[defs objectForKey:@"inLang"]
                                          waitUntilDone:YES];
        }

        [pool drain];
    }

CODE ABOVE:

Is there anything obvious in our code here that we may have missed?

I greatly appreciate any help you can offer, and I apologize in advance if this is the wrong place to post such messages. I also understand that this (likely) non-standard way of encoding Speex may not be something the members of this list can support, so I place no particular weight on a lack of response or on you kind folks being unable to help us with this problem.

Thanks in advance for your time and willingness to consider our situation!

--Quinn Ebert

PS: My apologies if two copies of this e-mail are received. I tried to send this before receiving my subscription confirmation e-mail, because that e-mail took about an hour to arrive. :-(