Lefteris Zafiris
2012-Jan-04 02:42 UTC
[asterisk-users] Speech recognition in asterisk using google voice API
Hello,

I have written an AGI script that uses the Google voice API for speech recognition. The script records from the current channel until the pound key (#) is pressed or the timeout (15 seconds) is reached. The recording is sent to Google's speech recognition service and the returned text string is assigned to a channel variable.

More info and dialplan examples can be found in the README file:
https://raw.github.com/zaf/asterisk-speech-recog/master/README

The script is available here:
https://github.com/zaf/asterisk-speech-recog

The code is still young and not thoroughly tested, so comments, suggestions and bug reports are more than welcome.

----------------
Lefteris Zafiris
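The flow described in this post (record until # or 15 seconds, send the audio off for recognition, store the text in a channel variable) can be sketched in a few lines. This is only an illustration of the AGI side in Python, not the author's script: the `recognize` callback, the `utterance` variable name and the `/tmp/speech-demo` path are all assumptions made for the example, and the actual call to the recognition service is deliberately left out.

```python
import re


def read_agi_env(inp):
    """Collect the 'agi_*: value' lines Asterisk sends before the first command."""
    env = {}
    while True:
        line = inp.readline().strip()
        if not line:  # a blank line (or EOF) ends the environment block
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def agi_command(command, inp, out):
    """Send one AGI command and return the integer that follows 'result='."""
    out.write(command + "\n")
    out.flush()
    reply = inp.readline().strip()
    match = re.match(r"200 result=(-?\d+)", reply)
    if match is None:
        raise RuntimeError("unexpected AGI reply: %r" % reply)
    return int(match.group(1))


def record_command(path, fmt="wav", timeout_s=15):
    """RECORD FILE takes its timeout in milliseconds; '#' ends the recording."""
    return 'RECORD FILE "%s" %s "#" %d' % (path, fmt, timeout_s * 1000)


def set_variable_command(name, value):
    """Build SET VARIABLE with backslashes and quotes in the value escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return 'SET VARIABLE %s "%s"' % (name, escaped)


def run(inp, out, recognize, path="/tmp/speech-demo"):
    """Record, hand the file to `recognize`, store the text on the channel.

    `recognize` is a hypothetical callback standing in for the request to
    the speech recognition service; 'utterance' is an assumed variable name.
    """
    read_agi_env(inp)
    agi_command(record_command(path), inp, out)
    text = recognize(path + ".wav")
    agi_command(set_variable_command("utterance", text), inp, out)
    return text
```

When started by Asterisk as an AGI, `run(sys.stdin, sys.stdout, recognize)` would be the entry point; passing the streams in explicitly also makes it possible to try the logic with in-memory streams.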
Bruce B
2012-Jan-04 05:43 UTC
[asterisk-users] Speech recognition in asterisk using google voice API
Very interesting. I just tried to get it to work, but it complains about sox. Perhaps you used a different version of sox?

PBX-*CLI>
/usr/bin/sox: invalid option -- -
/usr/bin/sox: invalid option -- n
/usr/bin/sox: invalid option -- o
/usr/bin/sox: -r must be given a positive integer
 -- speech-recog.agi: /usr/bin/sox failed: 512

I am using:
Package sox-12.18.1-1.el5_5.1.i386

Thanks,

--
_____________________________________________________________________
-- Bandwidth and Colocation Provided by http://www.api-digital.com --
New to Asterisk? Join us for a live introductory webinar every Thurs:
http://www.asterisk.org/hello

asterisk-users mailing list
To UNSUBSCRIBE or update options visit:
http://lists.digium.com/mailman/listinfo/asterisk-users
isrlgb at gmail.com
2012-Jan-04 18:27 UTC
[asterisk-users] Speech recognition in asterisk using google voice API
Does anyone know what languages are supported?
Michelle Dupuis
2012-Jan-04 18:47 UTC
[asterisk-users] Speech recognition in asterisk using google voice API
Wow - nice! A few quick questions:

1. How long can the recording be for transcription?
2. Is there any limit on how much text the returned (transcribed) variable can hold?
3. Are there any commercial / terms of use limitations?

________________________________
From: asterisk-users-bounces at lists.digium.com [asterisk-users-bounces at lists.digium.com] On Behalf Of Bruce B [bruceb444 at gmail.com]
Sent: Wednesday, January 04, 2012 1:25 PM
To: Asterisk Users List
Subject: Re: [asterisk-users] Speech recognition in asterisk using google voice API

> Note to self: "Never release anything asterisk related without testing on RHEL/Centos 5"
>
> Thank you for reporting this. I have replaced sox with flac and it seems to work now on older platforms too (tested on Centos 5 with asterisk 1.4).
> You can get the updated code here:
> https://github.com/zaf/asterisk-speech-recog/tarball/master
>
> ----------------
> Lefteris Zafiris

Works beautifully. Amazing job, Lefteris. Thanks.

The best probability I got was 0.9725632, by saying "hello". I think there is some non-phonetic logic built in as well. When I tried "1, 2" I got an accuracy of 0.86534226, while for "1, 2, 3, 4, 5" I got 0.97256315. Probably Google sees the pattern?!

What are some of the other tricks (if any) or considerations that one should make while creating a strong speech recognition enabled IVR?

Best,
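One way to put confidence values like the ones reported above to work in an IVR is to accept a result only when it clears a threshold and otherwise ask the caller again. The sketch below is a generic illustration and not something proposed in the thread; the function name, the 0.8 default threshold and the three-attempt limit are all assumptions chosen for the example.

```python
def next_action(confidence, attempts, threshold=0.8, max_attempts=3):
    """Decide what the IVR should do with one recognition result.

    confidence: score returned with the transcription (0.0 to 1.0)
    attempts:   number of failed attempts before this one
    The threshold and attempt limit are illustrative defaults only.
    """
    if confidence >= threshold:
        return "accept"
    if attempts + 1 < max_attempts:
        return "reprompt"
    return "fallback"  # e.g. switch to DTMF entry or an operator
```

With these defaults both scores quoted above (0.9725632 and 0.86534226) would be accepted on the first try; raising the threshold to 0.9 would send the "1, 2" result back for a reprompt.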
Lefteris Zafiris
2012-Jan-04 19:26 UTC
[asterisk-users] Speech recognition in asterisk using google voice API
> What are some of the other tricks (if any) or considerations that one should
> make while creating a strong speech recognition enabled IVR?

Google accepts sound files at any sampling rate (up to 44.1 kHz), so if you can use a wideband codec (e.g. G.722), it can greatly improve the sound quality and the detection rates. For now the script supports 8 kHz and 16 kHz sampling rates for recording; this can be set by editing the script's user-defined parameters (the variable $samplerate).

Anything that improves the clarity of the recording will help: a good phone, low background noise, etc. I have also read that normalizing the recording and setting the gain to -5 dB improves detection rates. I'm experimenting with this at the moment and there will be some new code soon (as soon as I get sox working on RHEL/CentOS 5 :P).

----------------
Lefteris Zafiris
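For readers wondering what "normalizing to -5 dB" amounts to numerically, here is a small pure-Python sketch of peak normalization on 16-bit samples. It is an illustration of the arithmetic only, under the assumption that -5 dB is meant relative to full scale; it is not the processing the script performs, and the function names are made up for the example.

```python
import array
import math

FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample


def normalize_peak(samples, target_db=-5.0):
    """Scale samples so the loudest one sits at target_db relative to full scale."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:  # silence (or no samples): nothing to scale
        return array.array("h", samples)
    target_peak = FULL_SCALE * 10 ** (target_db / 20.0)
    scale = target_peak / peak
    return array.array(
        "h",
        (max(-32768, min(32767, int(round(s * scale)))) for s in samples),
    )


def peak_db(samples):
    """Peak level in dB relative to full scale (negative infinity for silence)."""
    peak = max((abs(s) for s in samples), default=0)
    return 20 * math.log10(peak / FULL_SCALE) if peak else float("-inf")
```

A target of -5 dB corresponds to a linear factor of 10^(-5/20), roughly 0.56 of full scale, so a quiet recording is amplified and a hot one is attenuated until its peak lands there.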