Jonathan H
2017-Dec-06 14:33 UTC
[asterisk-users] Simple speech recognition for driving IVR - "press or say one".
Briefly: I want to be able to have "press or say (number)", with Asterisk listening for a spoken number but accepting a DTMF digit too.

I'm posting everything I've found so far here, partly to show my working, but also in case anyone else finds it useful. So, moving on...

This looked hopeful for a moment until I realised that it doesn't do DTMF:
https://wiki.asterisk.org/wiki/display/AST/Asterisk+15+Application_SpeechBackground

So then there's https://wiki.asterisk.org/wiki/display/AST/Asterisk+15+Application_Record, which can terminate on any DTMF key with "y", but according to the docs, "RECORD_STATUS" only sets a flag of "DTMF" (A terminating DTMF was received ('#' or '*', depending upon option 't')). So I don't get to know which key was pressed via that method, either.

There's very little information I can find about the built-in functions for speech recognition. https://wiki.asterisk.org/wiki/display/AST/Speech+Recognition+API doesn't actually explain how to integrate the actual speech engines.

In this previous forum post, https://community.asterisk.org/t/asterisk-15-jack-streams-speech-recognition-so-many-questions/72108/2 , jcolp explained that most people don't use the speech interface anyway, because "Asterisk modules are written in C, and it's more difficult to do things in that fashion. Using the Record and ship it off using Python, etc, is just easier and gets the job done for a lot of people to where they find it acceptable."

So, AGI it is! But I'm still stuck on how I record for speech AND get a DTMF digit if one was dialled.

Regarding speech in general, even "Asterisk - The Definitive Guide" just says: "Asterisk does not have speech recognition built in, but there are many third-party speech recognition packages that integrate with Asterisk. Much of that is outside of the scope of this book, as those applications are external to Asterisk" - helpful!
The speech-rec mailing list at http://lists.digium.com/pipermail/asterisk-speech-rec/ hasn't been posted to since 2013.

Someone else asked about speech recognition and UniMRCP in this post:
http://lists.digium.com/pipermail/asterisk-users/2017-February/290875.html

UniMRCP:
https://mojolingo.com/blog/2015/speech-rec-asterisk-get-started/
http://www.unimrcp.org/manuals/html/AsteriskManual.html#_Toc424230605
This has a Google Speech Recogniser plugin, but it's $50 per channel: http://www.unimrcp.org/gsr

*Reasons to use Lex over Google speech recognition*
- Has just been released in eu-west-1: https://forums.aws.amazon.com/ann.jspa?annID=5186
- Supports 8 kHz telephony: https://forums.aws.amazon.com/ann.jspa?annID=4775
- Is in the core AWS SDK: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/LexRuntime.html
- Has a number slot type: http://docs.aws.amazon.com/lex/latest/dg/built-in-slot-number.html - this means no accidental recognition of "won", "one" or "juan" instead of 1!

The pricing is definitely right: "The cost for 1,000 speech requests would be $4.00, and 1,000 text requests would cost $0.75. From the date you get started with Amazon Lex, you can process up to 10,000 text requests and 5,000 speech requests per month for free for the first year".

Amazon Transcribe looks promising too, but is only available by developer invitation at this time:
https://aws.amazon.com/transcribe/
https://aws.amazon.com/transcribe/pricing/

But all I need now is the quickest, simplest way to send Lex a short 8 kHz file and get a single digit back, as quickly and reliably as possible.

Before I travel too far down this road, can someone point me in the right direction and possibly steer me away from the wrong path?!
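For anyone following the "send Lex a short 8 kHz file and get a digit back" idea, here is a rough Python sketch of what that could look like. It is not from the thread and has not been run against a real bot. It assumes the boto3 SDK with the Lex (V1) runtime `post_content` call, a raw signed-linear recording (for example an Asterisk `.sln` file), and a bot whose intent fills one number slot. The bot name, alias and slot name are hypothetical, and the exact shape of the `slots` value in the response should be checked against the SDK documentation, which is why the helper accepts either a mapping or a JSON string.

```python
import json

# Hypothetical names: replace with your own bot, alias and slot.
BOT_NAME = "IvrDigits"
BOT_ALIAS = "prod"
SLOT_NAME = "Digit"

# Content type for 8 kHz, 16-bit, mono, little-endian PCM audio.
LPCM_8K = ("audio/lpcm; sample-rate=8000; sample-size-bits=16; "
           "channel-count=1; is-big-endian=false")


def digit_from_slots(slots, slot_name=SLOT_NAME):
    """Return a single digit ("0".."9") from a slots value, or None.

    `slots` may be a dict, a JSON string, or None (assumption: the SDK
    may hand back either form depending on version).
    """
    if isinstance(slots, str):
        slots = json.loads(slots) if slots else {}
    value = (slots or {}).get(slot_name)
    if value is None:
        return None
    value = str(value).strip()
    if len(value) == 1 and value in "0123456789":
        return value
    return None


def recognise_digit(pcm_path, user_id, region="eu-west-1"):
    """Send one short recording to Lex and return the digit, or None."""
    import boto3  # imported here so the helper above works without the SDK

    client = boto3.client("lex-runtime", region_name=region)
    with open(pcm_path, "rb") as audio:
        response = client.post_content(
            botName=BOT_NAME,
            botAlias=BOT_ALIAS,
            userId=user_id,
            contentType=LPCM_8K,
            accept="text/plain; charset=utf-8",
            inputStream=audio,
        )
    return digit_from_slots(response.get("slots"))
```

Keeping the slot handling in its own function means anything that is not exactly one digit is treated as "no match", so the dialplan can simply re-prompt.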
Jurijs Ivolga
2017-Dec-06 14:43 UTC
[asterisk-users] Simple speech recognition for driving IVR - "press or say one".
Hi,

I was able to achieve this using:

Jurijs
Jurijs Ivolga
2017-Dec-06 14:46 UTC
[asterisk-users] Simple speech recognition for driving IVR - "press or say one".
Hi,

I was able to achieve this using: http://zaf.github.io/asterisk-speech-recog/

I needed to change the code, as it didn't work out of the box. I did this a couple of years ago and unfortunately I no longer have the code, but it wasn't too difficult.

With kind regards,
Jurijs
Jonathan H
2017-Dec-06 14:50 UTC
[asterisk-users] Simple speech recognition for driving IVR - "press or say one".
Thanks Jurijs,

Yes, in fact I'm already using that, and it works fine.

The problem here is that I cannot find a way of recording speech AND listening for a DTMF digit being pressed as an alternative. That's where the problem lies.

J.
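One possible way to get both from an AGI script, sketched here as an assumption rather than something confirmed in this thread: the AGI `RECORD FILE` command takes a set of escape digits, and its reply is understood to carry the ASCII code of the digit that ended the recording, in the form `200 result=<code> (dtmf) endpos=<n>`, with `(timeout)` or `(hangup)` otherwise. The Python below only builds the command and parses the reply; the file path and helper names are made up for illustration, and the reply format should be verified on your Asterisk version.

```python
import re

# Assumed reply shape: "200 result=<int> (<reason>) endpos=<samples>"
_REPLY = re.compile(r"^200 result=(-?\d+)(?: \((\w+)\))?(?: endpos=(\d+))?")


def build_record_command(path, digits="0123456789", timeout_ms=5000,
                         silence_s=2, fmt="sln"):
    """Build an AGI RECORD FILE line that stops on any of `digits`."""
    return 'RECORD FILE {} {} "{}" {} BEEP s={}'.format(
        path, fmt, digits, timeout_ms, silence_s)


def parse_record_reply(line):
    """Split a RECORD FILE reply into reason, pressed digit and end position."""
    match = _REPLY.match(line.strip())
    if not match:
        raise ValueError("unexpected AGI reply: {!r}".format(line))
    code = int(match.group(1))
    reason = match.group(2) or ""
    endpos = int(match.group(3)) if match.group(3) else None
    # On a DTMF interrupt the result is the ASCII code of the key pressed.
    digit = chr(code) if reason == "dtmf" and code > 0 else None
    return {"reason": reason, "digit": digit, "endpos": endpos}


def next_step(reply):
    """Decide what to do after recording: use the key, or recognise speech."""
    parsed = parse_record_reply(reply)
    if parsed["digit"] is not None:
        return ("dtmf", parsed["digit"])
    if parsed["reason"] == "hangup":
        return ("hangup", None)
    return ("recognise", None)  # hand the recorded file to the speech engine
```

In a real script the command line would be written to stdout and the reply read from stdin, as with any AGI command; if the caller pressed a key, the speech engine is skipped entirely.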
Dan Cropp
2017-Dec-06 22:23 UTC
[asterisk-users] Simple speech recognition for driving IVR - "press or say one".
UniMRCP with one of the various speech recognition providers they support definitely works for this.

Specify multiple grammars in the MRCP call: one for the spoken text to listen for, another for the DTMF digits to listen for. The results will indicate which grammar matched and what was detected.

The combination of voice and/or DTMF is exactly what speech recognition has been designed for. I am very pleased with UniMRCP and the support they have given us.
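To illustrate "the results will indicate which grammar and what was detected": MRCP recognisers normally return the result as an NLSML XML document, where each interpretation names the grammar that matched and the input element says whether it was speech or DTMF. The Python below is a minimal sketch of reading such a result. The sample document and the grammar name in it are invented for illustration, and real results from your recogniser may carry namespaces or extra elements, which is why the tag comparison ignores namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical NLSML result for a caller who pressed "1".
SAMPLE_NLSML = """<result>
  <interpretation grammar="session:digits-dtmf" confidence="1.0">
    <instance>1</instance>
    <input mode="dtmf">1</input>
  </interpretation>
</result>"""


def _local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def first_interpretation(nlsml):
    """Return grammar, input mode and text of the first interpretation."""
    root = ET.fromstring(nlsml)
    for element in root.iter():
        if _local(element.tag) != "interpretation":
            continue
        found = {"grammar": element.get("grammar"), "mode": None, "text": None}
        for child in element.iter():
            if _local(child.tag) == "input":
                found["mode"] = child.get("mode")
                found["text"] = (child.text or "").strip()
                break
        return found
    return None  # no interpretation: treat as no-match
```

With one speech grammar and one DTMF grammar loaded, the `mode` and `grammar` fields are enough to tell "said one" from "pressed 1" and route the call the same way in both cases.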