Matthew John Darnell
2003-Jul-15 18:41 UTC
[Asterisk-Users] Text to Speech - Someone needs to do this
Why hasn't someone found 50 people who sound alike, put them in sound studios, and recorded the 10,000 most commonly used words? You would record all the different forms of the 1,000 most common words, i.e. leading, trailing, question, etc.

You can synthesize the other 0.05% when you run into them. With hard drives so big, processors so fast, and EXT3 able to handle 30,000+ files in a single directory, that seems like the way to do it.

You could sell it for BIG bucks.

-Matt
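To make the proposal concrete, here is a minimal sketch of the lookup it implies: choose a recorded variant of each word by its position in the sentence, and fall back to synthesis for words that were never recorded. The file naming scheme (`<word>_<form>.wav`), the form names, and both function names are assumptions for illustration only; nothing here comes from an existing product.

```python
# Hypothetical naming scheme: <word>_<form>.wav, where form is one of
# "leading", "trailing", "question", or "plain". These names are assumptions.


def pick_form(index: int, count: int, is_question: bool) -> str:
    """Choose which recorded variant of a word fits its position."""
    if index == count - 1:
        return "question" if is_question else "trailing"
    if index == 0:
        return "leading"
    return "plain"


def plan_utterance(sentence: str, recorded: set[str]) -> list[tuple[str, str]]:
    """Return (action, target) pairs: play a recording or synthesize the word."""
    is_question = sentence.strip().endswith("?")
    words = [w.strip(".,?!").lower() for w in sentence.split()]
    words = [w for w in words if w]
    plan = []
    for i, word in enumerate(words):
        preferred = f"{word}_{pick_form(i, len(words), is_question)}.wav"
        fallback = f"{word}_plain.wav"
        if preferred in recorded:
            plan.append(("play", preferred))
        elif fallback in recorded:
            plan.append(("play", fallback))
        else:
            # Word was never recorded: hand it to a synthesizer instead.
            plan.append(("synthesize", word))
    return plan
```

The sketch only picks files by word position. It does nothing about the transitions between words.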
Steve Underwood
2003-Jul-15 20:57 UTC
[Asterisk-Users] Text to Speech - Someone needs to do this
Matthew John Darnell wrote:
> Why hasn't someone found 50 people who sound alike, put them in sound
> studios and record the 10,000 most commonly used words. You would record
> all the different forms of the 1,000 most common words, i.e. leading,
> trailing, question etc.
>
> You can synthesize the other 0.05% when you run into them. With hard drives
> so big, processors so fast and EXT3 that can handle 30,000+ files in a
> single directory that seems like the way to do it.
>
> You could sell it for BIG bucks.

People have done this. The results are terrible. You couldn't charge big bucks. You'd have trouble giving it away.

Regards,
Steve
Chris Albertson
2003-Jul-15 23:04 UTC
[Asterisk-Users] Text to Speech - Someone needs to do this
People working on this have found that context influences the pronunciation of words. I think the root cause is that the human vocal tract cannot reshape itself instantly for different sounds; it must move from the previous sound to the next, and we hear that movement. If the sound changes instantly, we hear it as unnatural, robot-like speech. Your proposed system would sound just like what it is: a sequence of words. Good systems look not only at phonetic context but also at inflection, such as tone, volume, pitch range, and speed.

Cursive handwriting is this way too. Cursive fonts don't look like real handwriting because each letter is always the same.

--- Matthew John Darnell <mdarnell@servpac.com> wrote:
> Why hasn't someone found 50 people who sound alike, put them in sound
> studios and record the 10,000 most commonly used words. You would record
> all the different forms of the 1,000 most common words, i.e. leading,
> trailing, question etc.
>
> You can synthesize the other 0.05% when you run into them. With hard
> drives so big, processors so fast and EXT3 that can handle 30,000+ files
> in a single directory that seems like the way to do it.
>
> You could sell it for BIG bucks.
>
> -Matt

====
Chris Albertson
Home: 310-376-1029 chrisalbertson90278@yahoo.com
Cell: 310-990-7550
Office: 310-336-5189 Christopher.J.Albertson@aero.org
KG6OMK
Moshe Yudkowsky
2003-Jul-16 06:32 UTC
[Asterisk-Users] Text to Speech - Someone needs to do this
At 15:41 2003-07-15 -1000, Matthew John Darnell wrote:
> Why hasn't someone found 50 people who sound alike, put them in sound
> studios and record the 10,000 most commonly used words. You would record
> all the different forms of the 1,000 most common words, i.e. leading,
> trailing, question etc.
>
> You can synthesize the other 0.05% when you run into them. With hard drives
> so big, processors so fast and EXT3 that can handle 30,000+ files in a
> single directory that seems like the way to do it.
>
> You could sell it for BIG bucks.

Text-to-Speech (TTS) is usually either formant-based, created by synthesizing sounds, or concatenative, created by concatenating samples of actual speech. However, concatenative TTS usually works with small fragments of speech, not entire words. The storage requirements are much smaller, and it gives the system an opportunity to pick units of speech that match the units that precede and follow them.

The real trick is to get the correct prosody. Here are three sentences with the same words, each with different prosody:

"I said 'yes.'"
"I said yes?"
"_I_ said '_yes_'???!!"

Both formant and concatenative systems add prosody. Adding prosody to whole-word concatenative systems is difficult.

If you're in a buying mood, there are some excellent TTS systems available. For example, Rhetorical (http://www.rhetorical.com) has some excellent voices. The funniest TTS voice currently available is their "Southern California female" voice; I use it for non-serious demos ("That's so totally awesome."). Commercial TTS is actually very intelligible and perfectly adequate for many tasks.

--
Moshe Yudkowsky
Disaggregate
2952 W Fargo
Chicago, IL 60645 USA
www.Disaggregate.com
speech@pobox.com
+1 773 764 8727
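The idea of picking units that match their neighbours can be sketched as a small dynamic-programming search. This is only an illustration under stated assumptions: each candidate unit is reduced to a label plus a start and end pitch, the join cost is simply the pitch jump at the boundary, and the names `Unit`, `join_cost`, and `select_units` are hypothetical. Real systems use richer acoustic features and also score how well each unit fits the target prosody.

```python
# Each candidate unit is (label, start_pitch, end_pitch). The pitch values are
# stand-ins for whatever acoustic features a real system would compare.
Unit = tuple[str, float, float]


def join_cost(left: Unit, right: Unit) -> float:
    """Penalty for the pitch jump at the boundary between two units."""
    return abs(left[2] - right[1])


def select_units(candidates: list[list[Unit]]) -> list[Unit]:
    """Pick one unit per slot so the summed join cost is minimal.

    Assumes at least one slot and at least one candidate per slot.
    """
    # best[i][j] = (cheapest total cost ending at candidates[i][j], back-pointer)
    best = [[(0.0, -1) for _ in candidates[0]]]
    for i in range(1, len(candidates)):
        row = []
        for unit in candidates[i]:
            options = [
                (best[i - 1][k][0] + join_cost(prev, unit), k)
                for k, prev in enumerate(candidates[i - 1])
            ]
            row.append(min(options))
        best.append(row)
    # Trace back from the cheapest final unit.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    chosen = []
    for i in range(len(candidates) - 1, -1, -1):
        chosen.append(candidates[i][j])
        j = best[i][j][1]
    return chosen[::-1]
```

With whole words as the units there are few recorded variants per slot, so the search has little to choose from. Small fragments give it many more candidates per slot.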
In the cdr table, or in /var/log/asterisk/cdr-csv/Master.csv

srsergio

-----Original Message-----
From: asterisk-users-admin@lists.digium.com [mailto:asterisk-users-admin@lists.digium.com] On behalf of isamar@isamarmaia.org
Sent: Wednesday, 16 July 2003 23:54
To: asterisk-users@lists.digium.com
Subject: [Asterisk-Users] Dial SessionTime

Hi Folks,

After a successful Dial/H323, is there any way to get the conversation duration time?

Thanks,
Isamar
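As a rough sketch of reading the call duration out of Master.csv: the column order below is an assumption based on the usual cdr_csv layout and should be checked against your own file before relying on it. In that layout, `duration` covers the whole call from start to end, while `billsec` counts only the time after the call was answered, which is closer to "conversation time". The function name `call_durations` is hypothetical.

```python
import csv
import io

# Assumed column order for cdr_csv's Master.csv; verify against your own file.
FIELDS = [
    "accountcode", "src", "dst", "dcontext", "clid", "channel",
    "dstchannel", "lastapp", "lastdata", "start", "answer", "end",
    "duration", "billsec", "disposition", "amaflags",
]


def call_durations(csv_text: str, lastapp: str = "Dial") -> list[tuple[str, str, int]]:
    """Return (src, dst, billsec) for answered calls whose last app matches."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if len(record) < len(FIELDS):
            continue  # skip blank or truncated lines
        cdr = dict(zip(FIELDS, record))
        if cdr["lastapp"] == lastapp and cdr["disposition"] == "ANSWERED":
            rows.append((cdr["src"], cdr["dst"], int(cdr["billsec"])))
    return rows
```

To use it, read the file with `open("/var/log/asterisk/cdr-csv/Master.csv").read()` and pass the text to `call_durations`.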