----- Original Message -----
From: "Alexander Chemeris" <Alexander.Chemeris at sipez.com>
To: "Vincent Burel" <vincent.burel at vb-audio.com>
Cc: "Conrad Parker" <conrad at metadecks.org>; <speex-dev at xiph.org>; "Jean-Marc Valin" <jean-marc.valin at usherbrooke.ca>
Sent: Thursday, November 13, 2008 11:31 PM
Subject: Re: [Speex-dev] SPEEX on iPhone ?

> On Thu, Nov 13, 2008 at 11:23 AM, Vincent Burel
> <vincent.burel at vb-audio.com> wrote:
> >> 2008/11/13 Vincent Burel <vincent.burel at vb-audio.com>:
> >> > could you explain why 44.1kHz is *not* recommended with SPEEX ?
> >>
> >> my understanding is that the speex modes are tuned for particular
> >> sample rates, so using the 32kHz mode with 44.1kHz data will not yield
> >> as good quality as using the 32kHz mode with 32kHz data.
> >
> > ok, it's strange, by nature, such compression algorithms are able to do
> > down or upsampling very well...
> > Is there an impact on CPU load ?
>
> Nothing strange if you ever worked with speech compression before.

yes, also you could add : if i had any clue in the audio domain ! :-)

> Speech compression algorithms are always tuned to a particular freq,
> else they would take tons of time. That's because they use knowledge
> that speech pitch (and other params) lies in well specified regions.
> Thus if you feed the algorithm with the wrong freq, you'll fool it and it
> might not even detect speech at all.

Do you mean that speex works at 44.1 kHz like at 32 kHz, without sample rate conversion ?

It seems you suggest that working with a 44.1 kHz signal instead of 32 kHz (which is the native sampling rate for the ultra-wideband mode, as far as i understand) requires more CPU load. Since YOU have already worked with SPEEX, do you know how much CPU load it takes ? 10% more ? 50% more ?

> In short - you must do up/down-sampling by yourself. There're a lot of
> nice libraries for this. One of them is libspeexdsp. ;)

So you confirm that SPEEX does not do it ? It works with 44.1 kHz like with 32 kHz ?

Thanks in advance.
Regards
Vincent Burel

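Alexander's remark that pitch "lies in well specified regions" can be made concrete with a little arithmetic. The sketch below is only an illustration of that point, not a description of what Speex does internally: the helper names are hypothetical, and the 100 Hz pitch used in the note afterwards is an assumed example value.

```c
#include <assert.h>

/* Pitch period, in samples, of a voice at f0_hz captured at rate_hz. */
static double pitch_lag_samples(double f0_hz, double rate_hz)
{
    assert(f0_hz > 0.0 && rate_hz > 0.0);
    return rate_hz / f0_hz;
}

/* Pitch a codec would infer if it believes the data is at assumed_hz
 * while the samples were really captured at actual_hz. */
static double apparent_pitch_hz(double f0_hz, double actual_hz, double assumed_hz)
{
    assert(actual_hz > 0.0);
    return f0_hz * assumed_hz / actual_hz;
}

/* Real duration, in milliseconds, of n samples captured at actual_hz. */
static double frame_duration_ms(unsigned n, double actual_hz)
{
    assert(actual_hz > 0.0);
    return 1000.0 * (double)n / actual_hz;
}
```

With these helpers, a 100 Hz pitch spans 320 samples at 32 kHz but 441 samples at 44.1 kHz. A codec that treats 44.1 kHz data as 32 kHz would therefore see that voice as roughly 72.6 Hz, and a 640-sample block would cover about 14.5 ms of real time instead of 20 ms.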
On Fri, Nov 14, 2008 at 3:57 AM, Vincent Burel <vincent.burel at vb-audio.com> wrote:

> > Speech compression algorithms are always tuned to a particular freq,
> > else they would take tons of time. That's because they use knowledge
> > that speech pitch (and other params) lies in well specified regions.
> > Thus if you feed the algorithm with the wrong freq, you'll fool it and it
> > might not even detect speech at all.
>
> Do you mean that speex works at 44.1 kHz like at 32 kHz, without
> sample rate conversion ?
>
> It seems you suggest that working with a 44.1 kHz signal instead of 32 kHz
> (which is the native sampling rate for the ultra-wideband mode, as far as i
> understand) requires more CPU load. Since YOU have already worked with
> SPEEX, do you know how much CPU load it takes ? 10% more ? 50% more ?

Standard sample rate used with speex is 8kHz.
Speex wideband uses 16kHz.
Speex ultra-wideband uses 32kHz.
Many speech applications use just the standard 8kHz speex mode.

What Alex is saying is, you'll (probably) want to record from the sound card at 44.1kHz or 48kHz (depending on the native rate of the sound card, which isn't always easy to find out), then use a software resampler like the one speexdsp provides (but any implementation will do) to resample the data down to your target sample rate for speex (8kHz for standard speex, 16kHz for wb, 32kHz for uwb). Then push it through the speex routines, send it where it needs to be, decode with the speex routines, then resample it back up to 44.1/48kHz, and play it out at the destination end.

This, of course, presumes that your audio source and sink are provided by the myriad of sound cards that have been designed for Windows, and are supported by other OSes.

Basically, sound cards these days are designed to work with only one or two sample rates, and their drivers do the resampling to get to the target/destination sample rate. Quite a few years back (so some google research suggests), Microsoft disseminated some faulty resampling reference code that got integrated into a lot of sound card drivers, and exists to this day (so they say; take this with a grain of salt though). One thing I can be sure of (I have seen it) is that if you resample the data to the sound card's native sample rate (which, I think it is safe to say, is 48kHz or 44.1kHz for almost all sound cards), latency (definitely) and jitter (maybe?) go down pretty significantly.

-- 
Keith Kyzivat
SIPez LLC.
SIP VoIP, IM and Presence Consulting
http://www.SIPez.com
tel: +1 (617) 273-4000

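For the capture-side step Keith describes (resample from the card's native rate down to the Speex rate), the conversion ratio and the per-frame sample counts can be worked out ahead of time. This is a small sketch, assuming 20 ms frames, which is the Speex frame duration; the helper names are made up for illustration and are not part of libspeexdsp.

```c
#include <assert.h>

/* Greatest common divisor (Euclid's algorithm). */
static unsigned gcd_u(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Reduced rational conversion factor: interpolate by up, decimate by down. */
struct ratio {
    unsigned up;
    unsigned down;
};

static struct ratio resample_ratio(unsigned in_rate, unsigned out_rate)
{
    assert(in_rate > 0 && out_rate > 0);
    unsigned g = gcd_u(in_rate, out_rate);
    struct ratio r = { out_rate / g, in_rate / g };
    return r;
}

/* Number of samples in one frame of frame_ms milliseconds at rate_hz. */
static unsigned frame_samples(unsigned rate_hz, unsigned frame_ms)
{
    return rate_hz * frame_ms / 1000u;
}
```

Going from 44.1 kHz to 32 kHz reduces to a factor of 320/441, while 48 kHz to 32 kHz is a much simpler 2/3. One 20 ms frame is 882 samples at 44.1 kHz and 640 samples at 32 kHz, so a resampler in front of the ultra-wideband encoder consumes 882 captured samples for every 640 it hands over.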
you could use a simple filter like the one below. i just copied it from my code. figure things out for yourself. it downsamples 16kHz to 8kHz.

- farhan

    /* assumes struct Call holds int coef[12] and int delay[12] */
    p->coef[0] = 2087;
    p->coef[1] = 5979;
    p->coef[2] = -7282;
    p->coef[3] = -6747;
    p->coef[4] = 28284;
    p->coef[5] = 52763;
    p->coef[6] = 28284;
    p->coef[7] = -6747;
    p->coef[8] = -7282;
    p->coef[9] = 5979;
    p->coef[10] = 2087;
    p->coef[11] = -4816;    /* not read: the loop below uses taps 0..10 only */

    /* low-pass filters the block in place (it converts 320 samples at a time);
       keeping every second output sample to reach 8kHz is left to the caller */
    static void lowpass(struct Call *pc, short *pcm, int nsamples)
    {
        int i, j;
        long long sum;              /* wide enough for worst-case input */
        short out[6400];

        if (nsamples > 6400)        /* do not overrun out[] */
            nsamples = 6400;

        for (i = 0; i < nsamples; i++) {
            pc->delay[0] = (int)pcm[i];         /* update most recent sample */
            sum = 0;
            for (j = 0; j < 11; j++)            /* multiply samples by filter coefs */
                sum += (long long)pc->coef[j] * pc->delay[j];
            sum /= 100000;                      /* scale back to 16-bit range */
            if (sum > 32767) sum = 32767;       /* saturate instead of wrapping */
            if (sum < -32768) sum = -32768;
            out[i] = (short)sum;                /* destination = filtered sample */
            for (j = 11; j > 0; j--)            /* shift the delay line */
                pc->delay[j] = pc->delay[j-1];
        }
        for (i = 0; i < nsamples; i++)
            pcm[i] = out[i];
    }

On Fri, Nov 14, 2008 at 8:04 PM, Keith Kyzivat <kkyzivat at tripleplayint.com> wrote:
>
> On Fri, Nov 14, 2008 at 3:57 AM, Vincent Burel <vincent.burel at vb-audio.com>
> wrote:
>>
>> > Speech compression algorithms are always tuned to a particular freq,
>> > else they would take tons of time. That's because they use knowledge
>> > that speech pitch (and other params) lies in well specified regions.
>> > Thus if you feed the algorithm with the wrong freq, you'll fool it and it
>> > might not even detect speech at all.
>>
>> Do you mean that speex works at 44.1 kHz like at 32 kHz, without
>> sample rate conversion ?
>>
>> It seems you suggest that working with a 44.1 kHz signal instead of 32 kHz
>> (which is the native sampling rate for the ultra-wideband mode, as far as i
>> understand) requires more CPU load. Since YOU have already worked with
>> SPEEX, do you know how much CPU load it takes ? 10% more ? 50% more ?
>
> Standard sample rate used with speex is 8kHz.
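farhan's lowpass() filters every sample and leaves the actual dropping of samples to the caller. A minimal sketch of the whole 16 kHz to 8 kHz step in one function might look like the following. It reuses the first 11 taps quoted in the message; whether their frequency response is a good anti-alias filter has not been checked here. The struct and function names are hypothetical, and dividing by the tap sum (97405, for unity gain at DC) rather than by 100000 is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NTAPS   11
#define TAP_SUM 97405   /* sum of the taps below: unity gain at DC */

/* Symmetric low-pass taps taken from the message above (taps 0..10). */
static const int taps[NTAPS] = {
    2087, 5979, -7282, -6747, 28284, 52763, 28284, -6747, -7282, 5979, 2087
};

/* Filter history; zero-initialise before the first call. */
struct decimator {
    int delay[NTAPS];
};

/* Low-pass filter `in` and keep every second output sample.
 * n_in must be even so the phase stays aligned across calls.
 * Writes n_in / 2 samples to `out` and returns that count. */
static size_t decimate_by_2(struct decimator *d, const short *in,
                            size_t n_in, short *out)
{
    size_t n_out = 0;

    assert(n_in % 2 == 0);
    for (size_t i = 0; i < n_in; i++) {
        /* Shift the delay line and insert the newest sample. */
        for (int j = NTAPS - 1; j > 0; j--)
            d->delay[j] = d->delay[j - 1];
        d->delay[0] = in[i];

        /* Only compute the outputs that are kept. */
        if (i % 2 == 1) {
            long long acc = 0;
            for (int j = 0; j < NTAPS; j++)
                acc += (long long)taps[j] * d->delay[j];
            acc /= TAP_SUM;
            if (acc > 32767) acc = 32767;      /* saturate to 16 bits */
            if (acc < -32768) acc = -32768;
            out[n_out++] = (short)acc;
        }
    }
    return n_out;
}
```

Computing the filter only for the samples that survive halves the multiply count compared with filtering everything and discarding afterwards. For rates that are not a simple integer ratio, such as 44.1 kHz to 32 kHz, a general resampler like the one in libspeexdsp is the more practical route.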
----- Original Message -----
From: Keith Kyzivat
To: Vincent Burel
Cc: Alexander Chemeris ; speex-dev at xiph.org ; Jean-Marc Valin
Sent: Friday, November 14, 2008 3:34 PM
Subject: Re: [Speex-dev] SPEEX on iPhone ?

> On Fri, Nov 14, 2008 at 3:57 AM, Vincent Burel <vincent.burel at vb-audio.com> wrote:
>
> > > Speech compression algorithms are always tuned to a particular freq,
> > > else they would take tons of time. That's because they use knowledge
> > > that speech pitch (and other params) lies in well specified regions.
> > > Thus if you feed the algorithm with the wrong freq, you'll fool it and it
> > > might not even detect speech at all.
> >
> > Do you mean that speex works at 44.1 kHz like at 32 kHz, without
> > sample rate conversion ?
> >
> > It seems you suggest that working with a 44.1 kHz signal instead of 32 kHz
> > (which is the native sampling rate for the ultra-wideband mode, as far as i
> > understand) requires more CPU load. Since YOU have already worked with
> > SPEEX, do you know how much CPU load it takes ? 10% more ? 50% more ?
>
> Standard sample rate used with speex is 8kHz.
> Speex wideband uses 16kHz.
> Speex ultra-wideband uses 32kHz.
> Many speech applications use just the standard 8kHz speex mode.
>
> What Alex is saying is, you'll (probably) want to record from the sound
> card at 44.1kHz or 48kHz (depending on the native rate of the sound card,
> which isn't always easy to find out), then use a software resampler like
> the one speexdsp provides (but any implementation will do) to resample the
> data down to your target sample rate for speex (8kHz for standard speex,
> 16kHz for wb, 32kHz for uwb). Then push it through the speex routines, send
> it where it needs to be, decode with the speex routines, then resample it
> back up to 44.1/48kHz, and play it out at the destination end.
>
> This, of course, presumes that your audio source and sink are provided by
> the myriad of sound cards that have been designed for Windows, and are
> supported by other OSes.
>
> Basically, sound cards these days are designed to work with only one or two
> sample rates, and their drivers do the resampling to get to the
> target/destination sample rate. Quite a few years back (so some google
> research suggests), Microsoft disseminated some faulty resampling reference
> code that got integrated into a lot of sound card drivers, and exists to
> this day (so they say; take this with a grain of salt though). One thing I
> can be sure of (I have seen it) is that if you resample the data to the
> sound card's native sample rate (which, I think it is safe to say, is 48kHz
> or 44.1kHz for almost all sound cards), latency (definitely) and jitter
> (maybe?) go down pretty significantly.

Well, thanks for your reply about sound cards... but that was not really the question.

First, I remind you that I already use SPEEX at 44.1 kHz without converting the signal to the native 32 kHz, and it works very well.

So now I would like to get a reply to both of my original questions, if possible:

1- Does SPEEX do the down/upsampling when working with a 44.1 kHz signal, or does it process this signal as if it were 32 kHz ?
2- If I set up SPEEX at 32 kHz and do the down/upsampling myself, what does it bring ? Better quality ? Better CPU performance ? How much ?

Thanks in advance.
Regards
Vincent Burel