Pontus Carlsson wrote:

>Thanks!
>
>Btw, have you tried using SBR-technology or similar with speech codecs? That
>might be a good idea I thought.. But I don't know if it produces as good
>quality with speech codecs as it does for music codecs. Do you know if there
>is any open source variant of SBR?
>

SBR exploits a limitation of your ears. At high frequencies (like over 10kHz) you cannot determine pitch with any accuracy. You hear up to 15kHz to 20kHz (depending on age and other factors), but you really cannot identify pitch at these frequencies. You cannot even determine whether content above about 10kHz is properly harmonically related to the lower-pitched fundamentals which usually give rise to it.

I don't know of any voice-specific coder that even attempts to capture energy above 10kHz. SBR just isn't relevant. Most wideband speech coding captures only 7kHz to 8kHz of bandwidth. The key improvement this gives over the 3kHz to 4kHz that most mainstream voice coders capture is to clean up unvoiced sounds. fffff, sssss, and other unvoiced sounds sound almost the same at telephone bandwidth. At 7kHz bandwidth they have enough character to make them more distinguishable. The basic intelligibility improvement you get is usually small. However, the voice is rather more pleasant and less tiring to listen to. That brings considerable intelligibility improvements in a long discussion. Adding energy up to the limit of hearing adds more to the pleasantness of the voice, but it isn't usually considered enough to get people excited about committing extra bits per second to it.

Regards,
Steve

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'speex-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body. No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
Thanks!

Btw, have you tried using SBR-technology or similar with speech codecs? That might be a good idea I thought.. But I don't know if it produces as good quality with speech codecs as it does for music codecs. Do you know if there is any open source variant of SBR?

/Pontus

-----Original Message-----
From: owner-speex-dev@xiph.org [mailto:owner-speex-dev@xiph.org] On behalf of Jean-Marc Valin
Sent: 13 October 2002 05:57
To: speex
Subject: Re: [speex-dev] Speex modes

> I'm about finished developing a QuickTime component that supports Speex
> (on MacOS X and Windows).. As it is now the user can set complexity
> (SPEEX_SET_COMPLEXITY) and quality (SPEEX_SET_QUALITY /
> SPEEX_SET_VBR_QUALITY) and whether to use VBR or not. Will these options
> make it possible to produce all combinations of bitrates/qualities? Or
> should I also use SPEEX_SET_MODE/SPEEX_SET_LOW_MODE/SPEEX_SET_HIGH_MODE to
> accomplish this?

The first thing to know is that setting quality from 0-10 is in fact a more user-friendly way of setting the mode. That being said, for narrowband encoding, all modes are available with at least one quality setting (sometimes two quality settings point to the same mode because there are fewer than 10 modes). For wideband encoding, not all possible mode combinations (one mode for the low band, one for the high band) are available with the 10 quality settings, but those that aren't available are mostly useless anyway (e.g. a combination that gives you very good quality above 4 kHz but very poor quality below that is useless).

In most cases I would suggest not making the modes directly available, unless maybe for "expert users". The only other place where it can be useful is that modes have a specific bit-rate/quality associated with them, while the mapping between the "quality settings" and the modes is not guaranteed to remain the same in the future. That being said, you are probably better off keeping what you have now.

Hope this helps.

Jean-Marc

--
Jean-Marc Valin, M.Sc.A.
LABORIUS (http://www.gel.usherb.ca/laborius)
Université de Sherbrooke, Québec, Canada
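
To make the quality/mode relationship described above concrete, here is a minimal sketch. The table values and the names `QUALITY_TO_MODE`, `mode_for_quality` and `shared_modes` are hypothetical placeholders, not the real Speex mapping (which, as noted above, may change between versions). It only shows why several quality settings can land on the same mode when there are more settings than modes.

```python
# Hypothetical table: index = quality setting (0-10), value = mode id.
# These ids are placeholders for illustration, NOT the actual Speex mapping.
QUALITY_TO_MODE = [1, 1, 2, 3, 3, 4, 4, 5, 5, 6, 7]


def mode_for_quality(quality):
    """Look up the mode a given quality setting selects."""
    if not 0 <= quality <= 10:
        raise ValueError("quality must be in 0..10")
    return QUALITY_TO_MODE[quality]


def shared_modes(table):
    """Return {mode: [qualities]} for modes reached by more than one quality."""
    by_mode = {}
    for quality, mode in enumerate(table):
        by_mode.setdefault(mode, []).append(quality)
    return {mode: qs for mode, qs in by_mode.items() if len(qs) > 1}


if __name__ == "__main__":
    print(mode_for_quality(8))
    print(shared_modes(QUALITY_TO_MODE))
```

An application that needs a fixed bit-rate would select the mode directly instead of going through a table like this, since only the mode has a specific bit-rate attached to it.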
Well, I don't know what SBR is, but there's something in the wideband mode that may be similar: it's possible to encode the whole 4-8 kHz band with just ~1-2 kbps by only encoding the (LPC) shape of the spectrum and then just filling that band with "something that makes sense". Quality is quite reasonable...

Jean-Marc

On Sun, 13/10/2002 at 06:18, Steve Underwood wrote:
> Pontus Carlsson wrote:
>
> >Thanks!
> >
> >Btw, have you tried using SBR-technology or similar with speech codecs? That
> >might be a good idea I thought.. But I don't know if it produces as good
> >quality with speech codecs as it does for music codecs. Do you know if there
> >is any open source variant of SBR?
> >
> SBR exploits a limitation of your ears. At high frequencies (like over
> 10kHz) you cannot determine pitch with any accuracy. You hear up to
> 15kHz to 20kHz (depending on age and other factors), but you really
> cannot identify pitch at these frequencies. You cannot even determine if
> content above about 10kHz is properly harmonically related to the lower
> pitched fundamentals which usually give rise to them.
>
> I don't know of any voice specific coder that even attempts to capture
> energy above 10kHz. SBR just isn't relevant. Most wideband speech coding
> captures only 7kHz to 8kHz bandwidth. The key improvement that gives
> over the 3kHz to 4kHz most mainstream voice coders capture is to clean
> up unvoiced sounds. fffff, sssss, and other unvoiced sounds appear
> almost the same at telephone bandwidth. At 7kHz bandwidth they have
> enough character to make them more distinguishable. The basic
> intelligibility improvement you get is usually small. However, the voice
> is rather more pleasant and less tiring to listen to. That brings
> considerable intelligibility improvements in a long discussion. Adding
> energy up to the limit of hearing adds more to the pleasantness of the
> voice, but it isn't usually considered enough to get people excited
> about committing extra bits per second to it.
>
> Regards,
> Steve

--
Jean-Marc Valin, M.Sc.A.
LABORIUS (http://www.gel.usherb.ca/laborius)
Université de Sherbrooke, Québec, Canada
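
The idea of sending only the LPC shape of a band and filling it with "something that makes sense" can be sketched roughly as below. This is an illustration under stated assumptions, not the Speex implementation: the thread does not say what the filler signal is, so white noise is assumed here, the function name `fill_band_with_shaped_noise` and the example coefficients are hypothetical, and the band splitting that would place the result in the 4-8 kHz range is left out. The decoder would receive only the filter coefficients and a gain, then generate the excitation locally.

```python
import random


def fill_band_with_shaped_noise(lpc_coeffs, gain, num_samples, seed=0):
    """Shape locally generated white noise with an all-pole LPC filter.

    Implements y[n] = e[n] - sum(a_k * y[n-k]), i.e. the filter 1/A(z)
    with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.
    Assumption: the caller supplies coefficients of a stable filter.
    """
    rng = random.Random(seed)
    history = [0.0] * len(lpc_coeffs)  # past outputs, most recent first
    out = []
    for _ in range(num_samples):
        excitation = gain * rng.gauss(0.0, 1.0)  # assumed filler: white noise
        sample = excitation - sum(a * y for a, y in zip(lpc_coeffs, history))
        out.append(sample)
        history = [sample] + history[:-1]
    return out


if __name__ == "__main__":
    # Hypothetical one-pole envelope (pole at 0.9); only these few
    # parameters would need to be transmitted, not the waveform itself.
    band = fill_band_with_shaped_noise([-0.9], gain=0.1, num_samples=160)
    print(len(band))
```

Because the excitation is never transmitted, the cost per frame is just the envelope parameters, which is consistent with the very low bit-rate mentioned above for that band.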