On Apr 21, 2007, at 12:36 PM, zmorris@mac.com wrote:

> Hi, I tried both the stable and beta versions of the speex source
> code download on Mac OS 10.4.9. I just do:
>
> ...
>
> However, when I play the output file, I get the header and a second
> of audio, but the rest is just noise.

I figured it out, the problem is that WAV files are little endian and I am on powerpc, so the sample byte order must be swapped. That still isn't the whole story, and I will explain how it's still not working, but first you need to update your example at:

http://www.speex.org/docs/manual/speex-manual/node13.html#SECTION000131000000000000000

To say something along the lines of "when using the example with WAV files on a big endian architecture like powerpc, you must swap the byte order of the samples with the following code:"

////////////////////////////////////////

/* Define to 1 if your processor stores words with the most significant
   byte first (like Motorola and SPARC, unlike Intel and VAX). */
#define WORDS_BIGENDIAN 1

unsigned short be_short(unsigned short s)
{
    unsigned short ret=s;
#ifndef WORDS_BIGENDIAN
    ret = s>>8;
    ret += s<<8;
#endif
    return ret;
}

unsigned short le_short(unsigned short s)
{
    unsigned short ret=s;
#ifdef WORDS_BIGENDIAN
    ret = s>>8;
    ret += s<<8;
#endif
    return ret;
}

////////////////////////////////////////

These functions should really be inline or macros, and I believe the | operator is faster than + on RISC because it can be executed faster alongside << and >> in the pipeline but I could be wrong.
Then you need to change the 2 lines that convert the samples to floats to say:

////////////////////////////////////////

/*Copy the 16 bits values to float so Speex can work on them*/
for (i=0;i<FRAME_SIZE;i++)
    input[i] = le_short( in[i] );

/*Copy from float to short (16 bits) for output; cast to short first,
  because converting a negative float straight to unsigned short is undefined*/
for (i=0;i<FRAME_SIZE;i++)
    out[i] = le_short( (short) output[i] );

////////////////////////////////////////

Also, endian issues are so important that they need to be discussed in the manual more. You should probably mention:

#include <arpa/inet.h>
// for transmitting on network
ntohs()
htons()

from sockets, and:

#include <CoreFoundation/CFByteOrder.h>
// for import/export wav
CFSwapInt16LittleToHost()
CFSwapInt16HostToLittle()
// for transmitting on network
CFSwapInt16BigToHost()
CFSwapInt16HostToBig()

on Mac, right there on the example page. Also, please discuss the format of the compressed bits from speex, because it's not clear to me if they are already big endian for sending over the network.

The real kicker is that after doing this, I am still getting noise! I can make out the voice but it sounds 50% noisy now, and it's not the minimal noise that is always present. This is full-blown-every-other-sample-is-wrong noise.

To trace it to this point, I downloaded and installed libogg from:

http://xiph.org/downloads/

so that I could use speexenc and speexdec from the console, which worked perfectly. Then I traced it back to where it was reading samples and found the endian conversion. I will have to backtrack even further now to find out exactly why my test still isn't working, and it is becoming quite a waste of time. I've lost 2 days to this so far and it's looking like that might become 3. This is all with 1.2beta1 so I will try 1.0.5 and see if it fixes the noise problem.

One way around all of this is to include more projects and not just focus on win32. An xcode project showing the example would be a good start. Adding a few lines to handle the WAV header would be ideal.
--Zack
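The byte-order handling discussed in the message above can also be sketched without any compile-time endianness switch, by assembling each sample from its two bytes. This is only a minimal sketch under stated assumptions: the helper names load_le16(), store_le16() and frame_bytes_to_float() are hypothetical and not part of the Speex API, and the frame is assumed to have been read into an unsigned char buffer rather than a short array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the Speex API. */

/* Assemble one signed 16-bit sample from two little-endian bytes. */
static int16_t load_le16(const unsigned char *p)
{
    uint16_t u = (uint16_t)(p[0] | (p[1] << 8));
    /* Map to the signed range without implementation-defined narrowing. */
    return (int16_t)(u >= 0x8000u ? (int)u - 0x10000 : (int)u);
}

/* Store one signed 16-bit sample as two little-endian bytes. */
static void store_le16(unsigned char *p, int16_t v)
{
    uint16_t u = (uint16_t)v;
    p[0] = (unsigned char)(u & 0xFF);
    p[1] = (unsigned char)(u >> 8);
}

/* Convert n little-endian samples in raw[] to floats for the encoder. */
static void frame_bytes_to_float(const unsigned char *raw, float *dst, size_t n)
{
    size_t i;

    assert(raw != NULL && dst != NULL);
    for (i = 0; i < n; i++)
        dst[i] = (float)load_le16(raw + 2 * i);
}
```

Because the bytes are combined arithmetically, the same code should behave identically on powerpc and on Intel, at the cost of reading the file into a byte buffer first.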
On Apr 21, 2007, at 3:53 PM, zmorris@mac.com wrote:

> On Apr 21, 2007, at 12:36 PM, zmorris@mac.com wrote:
>
>> Hi, I tried both the stable and beta versions of the speex source
>> code download on Mac OS 10.4.9. I just do:
>>
>> ...
>>
>> However, when I play the output file, I get the header and a
>> second of audio, but the rest is just noise.
>
> I figured it out, the problem is that WAV files are little endian
> and I am on powerpc, so the sample byte order must be swapped.
> That still isn't the whole story, and I will explain how it's still
> not working, but first you need to update your example at:

OK I finally figured out the second noise problem. It's a riddle wrapped in a mystery inside an enigma. Judging by the somewhat odd structure of le_short() and be_short(), I think this keeps coming up over and over again. Even Apple's byte swapping macros fail under certain circumstances, and here's why:

1. Logical shifting must be used, not arithmetic (or else the high bit is wrong)
2. Conversion from short to float must be signed, not unsigned (or else the line level is wrong)
3. The order of operations must be written just-so, or gcc stumbles over the conversion from short to float

The offending line was this one:

for (i=0;i<FRAME_SIZE;i++) input[i] = le_short( in[i] );

This always fails with the current le_short() function. Here are updated versions, in both macro and inline form. You can leave inline off of the function version and it still works fine:

////////////////////////////////////////

/* Define to 1 if your processor stores words with the most significant
   byte first (like Motorola and SPARC, unlike Intel and VAX). */
#define WORDS_BIGENDIAN 1

/* Macro form. The (short) cast wraps the whole swapped value so the
   result is signed before it is converted to float. */
#ifdef WORDS_BIGENDIAN
#define le_short( s ) ((short) (((unsigned short) (s) << 8) | ((unsigned short) (s) >> 8)))
#define be_short( s ) ((short) (s))
#else
#define le_short( s ) ((short) (s))
#define be_short( s ) ((short) (((unsigned short) (s) << 8) | ((unsigned short) (s) >> 8)))
#endif

/* Function form. Use this instead of the macros, not alongside them:
   a macro with the same name would be expanded inside these definitions. */
inline short le_short( unsigned short s )
{
#ifdef WORDS_BIGENDIAN
    return( (s << 8) | (s >> 8) );
#endif
    return s;
}

inline short be_short( unsigned short s )
{
#ifndef WORDS_BIGENDIAN
    return( (s << 8) | (s >> 8) );
#endif
    return s;
}

////////////////////////////////////////

Now I am sorry to rant a moment, but I am getting too old for this stuff. Whenever I am speaking with fellow programmers, they love to jump to the conclusion that the problem is mine and that there is nothing wrong with the code. I am experienced enough to know that there are multitudes of reasons why things fail and that yes, even vanilla code examples often fail. I tried your sample code and it didn't work, plain and simple.

UNIX is like this through and through. Whenever I try something, I just know it won't work, like I knew this wouldn't work. The fault is almost never mine (at least not through incompetence, because I follow directions explicitly), but is caused by assumptions about the infrastructure by myself and others. For all its benefits, the unix mindset can easily condemn us to a life of servitude, babysitting our computers instead of getting real work done.

Anyway, I hope this series of emails shows the importance of providing full examples for the target environments. If you ever want speex to catch on in the mainstream, everyday people need to be able to drop it in, but at this point only hackers can really do that. I mean the kind of hackers willing to spend 2 days to fix 2 lines of code that should "just work."
So I thank you from the bottom of my heart for making such a brilliant piece of software as speex, I really appreciate what you guys stand for :) I just hope that you get some time to step back and polish up the sample code and manual just a hair. I know how it goes when you have to spend all of your time on development to keep something even just working, so this rant is not directed at anyone in particular :)

--Zack
Hi, I tried both the stable and beta versions of the speex source code download on Mac OS 10.4.9. I just do:

./configure
make
sudo make install

Then I added libspeex.a from /usr/local/lib and the headers to my xcode project. My app compiles and I'm able to call all of the speex functions. I copied the example code from the website and tweaked it to include the first 10000 bytes of the female.wav file (to include the header).

However, when I play the output file, I get the header and a second of audio, but the rest is just noise. I've stared at it for hours and tried every combination of quality settings but it just doesn't work. Can someone please tell me what I am doing wrong? Thanx, and here is the code:

--Zack

////////////////////////////////////////

#include "speex/speex.h"
#include <stdio.h>

#define FRAME_SIZE 160

int main( void )
{
    char *inFile = "female.wav";
    char *tempFile = "compressed.wav";
    char *outFile = "speex-result.wav";
    FILE *fin;
    FILE *fout;
    char header[10000]; // storage for wav header
    short in[FRAME_SIZE];
    float input[FRAME_SIZE];
    /*Holds the audio that will be written to file (16 bits per sample)*/
    short out[FRAME_SIZE];
    /*Speex handle samples as float, so we need an array of floats*/
    float output[FRAME_SIZE];
    char cbits[200];
    int nbBytes;
    /*Holds the state of the encoder*/
    void *state;
    /*Holds bits so they can be read and written to by the Speex routines*/
    SpeexBits bits;
    int i, tmp;

    //////////////////// encoder ////////////////////

    /*Create a new encoder state in narrowband mode*/
    state = speex_encoder_init(&speex_nb_mode);

    /*Set the quality to 8 (15 kbps)*/
    tmp=8;
    speex_encoder_ctl(state, SPEEX_SET_QUALITY, &tmp);

    //inFile = argv[1];
    /*Open in binary mode so no byte is translated on any platform*/
    fin = fopen(inFile, "rb");
    if( !fin ) return( -1 ); // couldn't find file
    fread(header, sizeof(char), 10000, fin); // get wav header
    fout = fopen(tempFile, "wb");

    /*Initialization of the structure that holds the bits*/
    speex_bits_init(&bits);
    while (1)
    {
        /*Read a 16 bits/sample audio frame*/
        fread(in, sizeof(short), FRAME_SIZE, fin);
        if (feof(fin))
            break;
        /*Copy the 16 bits values to float so Speex can work on them*/
        for (i=0;i<FRAME_SIZE;i++)
            input[i]=in[i];

        /*Flush all the bits in the struct so we can encode a new frame*/
        speex_bits_reset(&bits);

        /*Encode the frame*/
        speex_encode(state, input, &bits);
        /*Copy the bits to an array of char that can be written*/
        nbBytes = speex_bits_write(&bits, cbits, 200);

        /*Write the size of the frame first. This is what sampledec expects but
          it's likely to be different in your own application*/
        fwrite(&nbBytes, sizeof(int), 1, fout);
        /*Write the compressed data*/
        fwrite(cbits, 1, nbBytes, fout);
    }

    /*Destroy the encoder state*/
    speex_encoder_destroy(state);
    /*Destroy the bit-packing struct*/
    speex_bits_destroy(&bits);
    fclose(fin);
    fclose(fout);

    //////////////////// decoder ////////////////////

    /*Create a new decoder state in narrowband mode*/
    state = speex_decoder_init(&speex_nb_mode);

    /*Set the perceptual enhancement on*/
    tmp=1;
    speex_decoder_ctl(state, SPEEX_SET_ENH, &tmp);

    //outFile = argv[1];
    fin = fopen(tempFile, "rb");
    fout = fopen(outFile, "wb");
    fwrite(header, sizeof(char), 10000, fout);

    /*Initialization of the structure that holds the bits*/
    speex_bits_init(&bits);
    while (1)
    {
        /*Read the size encoded by sampleenc, this part will likely be
          different in your application*/
        fread(&nbBytes, sizeof(int), 1, fin);
        fprintf (stderr, "nbBytes: %d\n", nbBytes);
        if (feof(fin))
            break;

        /*Read the "packet" encoded by sampleenc*/
        fread(cbits, 1, nbBytes, fin);
        /*Copy the data into the bit-stream struct*/
        speex_bits_read_from(&bits, cbits, nbBytes);

        /*Decode the data*/
        speex_decode(state, &bits, output);

        /*Copy from float to short (16 bits) for output*/
        for (i=0;i<FRAME_SIZE;i++)
            out[i]=output[i];

        /*Write the decoded audio to file*/
        fwrite(out, sizeof(short), FRAME_SIZE, fout);
    }

    /*Destroy the decoder state*/
    speex_decoder_destroy(state);
    /*Destroy the bit-stream struct*/
    speex_bits_destroy(&bits);
    fclose(fin);
    fclose(fout);
    return 0;
}

////////////////////////////////////////
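The code above copies a fixed 10000 bytes as the "header", which also carries some audio along uncompressed. As a rough sketch of what handling the WAV header could look like instead, the function below scans the RIFF chunks in a buffer and reports where the sample data begins. The names wav_find_data() and load_le32() are hypothetical, and the sketch assumes a plain RIFF/WAVE layout whose chunks fit entirely in the buffer; the actual header of female.wav has not been checked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian unsigned value, independent of host byte order. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: scan a RIFF/WAVE header held in buf[0..len) and return
 * the byte offset where the sample data starts, or 0 if no "data" chunk is
 * found. On success *data_size receives the declared size of that chunk.
 */
static size_t wav_find_data(const unsigned char *buf, size_t len, uint32_t *data_size)
{
    size_t pos = 12; /* skip "RIFF", the RIFF size, and "WAVE" */

    assert(buf != NULL && data_size != NULL);
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return 0;

    while (pos + 8 <= len) {
        uint32_t size = load_le32(buf + pos + 4);

        if (memcmp(buf + pos, "data", 4) == 0) {
            *data_size = size;
            return pos + 8;
        }
        /* Give up if a non-data chunk claims to extend past the buffer. */
        if (size > len - pos - 8)
            return 0;
        /* Chunks are padded to an even number of bytes. */
        pos += 8 + (size_t)size + (size & 1u);
    }
    return 0;
}
```

With an offset like this, the program could copy exactly the header bytes to the output file and seek to the first sample before encoding, rather than guessing a length.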