On 05/04/2011 08:49 PM, Juha Heinanen wrote:
> Juha Heinanen writes:
>
>> any idea why size of wav file doubles when it is encoded to speex and
>> back to wav:
>>
>> $ ls -ls testi.wav
>> 40 -rw-r--r-- 1 foo foo 40674 May 4 14:38 testi.wav
>>
>> $ speexenc --denoise --agc --quality 10 testi.wav testi.spx
>> Encoding 8000 Hz audio using narrowband mode (mono)
>>
>> $ ls -ls testi.spx
>> 20 -rw-r--r-- 1 foo foo 16405 May 4 14:46 testi.spx
>>
>> $ speexdec --mono --rate 8000 testi.spx testi.wav
>> Decoding 8000 Hz audio using narrowband mode (mono)
>> Encoded with Speex 1.2rc1
>>
>> $ ls -ls testi.wav
>> 84 -rw-r--r-- 1 foo foo 81464 May 4 14:48 testi.wav
> i guess the answer is that wav file produced by speexdec is 16 bit when
> the input wav file to speexenc was 8 bit. if i use sox to convert the
> output wav file from 16 bit to 8 bit, audio quality gets really bad.
>
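The sizes quoted above are consistent with that guess. A rough check, assuming both files are plain mono PCM WAV with the canonical 44-byte header (an assumption; real headers can carry extra chunks), is sketched below. The helper names are made up for illustration.

```python
# Assumption: canonical 44-byte header for an uncompressed PCM WAV file.
HEADER_BYTES = 44


def samples_in(file_size, bits_per_sample, channels=1, header_bytes=HEADER_BYTES):
    """Estimate how many sample frames a PCM WAV file of this size holds."""
    bytes_per_frame = channels * (bits_per_sample // 8)
    return (file_size - header_bytes) // bytes_per_frame


def pcm_wav_size(num_samples, bits_per_sample, channels=1, header_bytes=HEADER_BYTES):
    """Estimate the size in bytes of a PCM WAV file holding num_samples frames."""
    return header_bytes + num_samples * channels * (bits_per_sample // 8)


# File sizes taken from the ls output quoted above.
n_in = samples_in(40674, 8)     # original 8-bit file
n_out = samples_in(81464, 16)   # file written by speexdec, assumed 16-bit

print("samples in original (8-bit):  ", n_in)
print("samples in decoded (16-bit):  ", n_out)
print("original re-sized to 16-bit:  ", pcm_wav_size(n_in, 16), "bytes")
```

Under these assumptions the two files hold nearly the same number of samples, so almost all of the growth comes from the sample width going from 1 byte to 2 bytes. The small remainder is probably padding to a whole codec frame, but that is a guess.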
If you are interested in assessing voice quality, *never* use 8-bit audio
files. Unless the voice level is fairly constant and uses most of the
8-bit range, the quality of your original will be poor. Nobody uses
plain 8-bit audio for anything serious *.

* Some people get confused about this because the telephone network uses
8 bits per sample, and they think it's plain 8-bit audio. The telephone
network actually uses one of two forms of lossy compression: uLaw in the
US and a few other places, and ALaw everywhere else. The audio before
compression is about 12 to 13 bits with these codecs. 12 bits is enough
to handle the dynamic range of a voice reasonably well.
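To make the difference concrete, here is a minimal sketch comparing plain 8-bit linear quantization with 8-bit quantization after uLaw-style companding on a quiet tone. It uses the textbook continuous mu-law curve with mu = 255, not the segmented G.711 encoder that real telephony equipment uses, and the function names and the test tone are arbitrary choices for illustration.

```python
import math

MU = 255.0  # mu-law parameter; the continuous curve is used here, not G.711 segments


def quantize_linear8(x):
    """Quantize x in [-1, 1] to a signed 8-bit linear code and back."""
    code = max(-128, min(127, round(x * 127)))
    return code / 127


def mulaw_compress(x):
    """Continuous mu-law compression of x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize_mulaw8(x):
    """Compress, quantize to 8 bits, then expand back to linear."""
    return mulaw_expand(quantize_linear8(mulaw_compress(x)))


def snr_db(amplitude, quantizer, n=8000, rate=8000, freq=440.0):
    """Signal-to-quantization-noise ratio in dB for a sine of given amplitude."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * freq * i / rate)
        err = x - quantizer(x)
        signal_power += x * x
        noise_power += err * err
    return 10 * math.log10(signal_power / noise_power)


for amplitude in (1.0, 0.1, 0.01):
    print("amplitude %5.2f  linear: %5.1f dB  mu-law: %5.1f dB" % (
        amplitude,
        snr_db(amplitude, quantize_linear8),
        snr_db(amplitude, quantize_mulaw8),
    ))
```

At full scale the linear quantizer should come out ahead, but as the level drops its SNR falls away quickly while the companded version stays roughly constant. That is the behaviour that lets 8 bits per sample work on the telephone network, and it is why plain 8-bit files are a poor reference for quality tests.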
Steve