thr3ads.net - asterisk users - [Asterisk-Users] Text to Speech

Matthew John Darnell

2003-Jul-15 18:41 UTC

[Asterisk-Users] Text to Speech - Someone needs to do this

Why hasn't someone found 50 people who sound alike, put them in sound
studios and recorded the 10,000 most commonly used words?  You would record all
the different forms of the 1,000 most common words, e.g. leading, trailing,
question, etc.

You can synthesize the other 0.05% when you run into them.  With hard drives
so big, processors so fast and EXT3 that can handle 30,000+ files in a
single directory, that seems like the way to do it.
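
A rough sketch of the lookup this proposal implies, in Python. The inventory, the file names, and the "middle" form are made up for illustration (the post only names leading, trailing and question forms); any word without a recording falls through to a synthesizer.

```python
import re

# Hypothetical inventory: (word, form) -> recording file name.
RECORDINGS = {
    ("hello", "leading"): "hello_leading.wav",
    ("how", "leading"): "how_leading.wav",
    ("are", "middle"): "are_middle.wav",
    ("you", "trailing"): "you_trailing.wav",
    ("you", "question"): "you_question.wav",
}


def word_form(index, count, is_question):
    """Pick the recorded form from the word's position in the sentence."""
    if index == count - 1:
        return "question" if is_question else "trailing"
    if index == 0:
        return "leading"
    return "middle"  # assumed extra form, not named in the post


def plan_playback(sentence):
    """Return a list of ("play", file) or ("synthesize", word) steps."""
    is_question = sentence.strip().endswith("?")
    words = re.findall(r"[a-z']+", sentence.lower())
    steps = []
    for i, word in enumerate(words):
        form = word_form(i, len(words), is_question)
        filename = RECORDINGS.get((word, form))
        if filename is None:
            # No recording for this word in this form: synthesize it.
            steps.append(("synthesize", word))
        else:
            steps.append(("play", filename))
    return steps
```

Each word is chosen without looking at its neighbours, so nothing in this plan smooths the joins between recordings.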

You could sell it for BIG bucks.

-Matt

Steve Underwood

2003-Jul-15 20:57 UTC

[Asterisk-Users] Text to Speech - Someone needs to do this

Matthew John Darnell wrote:
>Why hasn't someone found 50 people who sound alike, put them in sound
>studios and recorded the 10,000 most commonly used words?  You would record all
>the different forms of the 1,000 most common words, e.g. leading, trailing,
>question, etc.
>
>You can synthesize the other 0.05% when you run into them.  With hard drives
>so big, processors so fast and EXT3 that can handle 30,000+ files in a
>single directory that seems like the way to do it.
>
>You could sell it for BIG bucks.
>
People have done this. The results are terrible. You couldn't charge big
bucks. You'd have trouble giving it away.

Regards,
Steve

Chris Albertson

2003-Jul-15 23:04 UTC

[Asterisk-Users] Text to Speech - Someone needs to do this

People working on this have found that context influences the
pronunciation of words.  I think the root cause is that the human
vocal tract cannot reshape itself instantly for different sounds;
it must move from the previous sound to the next one, and we hear
that movement.  If the sound changes instantly, we hear it as
unnatural, robot-like speech.  Your proposed system would sound
just like what it is: a sequence of words.  Good systems look not
only at phonetic context but also at inflection such as tone,
volume, pitch range and speed.

Cursive handwriting is this way too.  Cursive fonts don't
look like real handwriting because each letter is always
the same.

Chris Albertson
  Home:   310-376-1029  chrisalbertson90278@yahoo.com
  Cell:   310-990-7550
  Office: 310-336-5189  Christopher.J.Albertson@aero.org
  KG6OMK

Moshe Yudkowsky

2003-Jul-16 06:32 UTC

[Asterisk-Users] Text to Speech - Someone needs to do this

At 15:41 2003-07-15 -1000, Matthew John Darnell wrote:
>Why hasn't someone found 50 people who sound alike, put them in sound
>studios and recorded the 10,000 most commonly used words?  You would record all
>the different forms of the 1,000 most common words, e.g. leading, trailing,
>question, etc.
>
>You can synthesize the other 0.05% when you run into them.  With hard drives
>so big, processors so fast and EXT3 that can handle 30,000+ files in a
>single directory, that seems like the way to do it.
>
>You could sell it for BIG bucks.

Text-to-Speech (TTS) is usually either formant-based, created by synthesizing
sounds; or concatenative, created by concatenating samples of actual speech.

However, concatenative TTS usually works by using small fragments of 
speech, not entire words. The storage requirements are much smaller, and it 
gives the system an opportunity to pick units of speech that match the 
units of speech that precede and follow them.
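
To make the "pick units that match their neighbours" idea concrete, here is a minimal sketch in Python. It assumes each candidate fragment is described only by an invented start and end pitch, and it scores a join by the pitch jump at the boundary. Real systems use much richer features and also score how well each unit fits its target; none of the names below come from an actual TTS product.

```python
# Each candidate unit is (unit_id, start_pitch_hz, end_pitch_hz).
# The representation and the pitch values are assumptions for illustration.


def join_cost(prev_unit, next_unit):
    """Penalty for a pitch jump at the boundary between two units."""
    return abs(prev_unit[2] - next_unit[1])


def select_units(candidates_per_slot):
    """Pick one candidate per slot, minimizing the total join cost.

    Dynamic programming over the slots (Viterbi-style): for every
    candidate, remember the cheapest way to reach it.
    """
    if not candidates_per_slot:
        return [], 0.0
    # best[i][j] = (cheapest cost of a path ending at candidate j of
    #               slot i, index of the chosen candidate in slot i - 1)
    best = [[(0.0, None) for _ in candidates_per_slot[0]]]
    for i in range(1, len(candidates_per_slot)):
        row = []
        for unit in candidates_per_slot[i]:
            options = [
                (best[i - 1][k][0] + join_cost(prev, unit), k)
                for k, prev in enumerate(candidates_per_slot[i - 1])
            ]
            row.append(min(options))
        best.append(row)
    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(candidates_per_slot) - 1, -1, -1):
        path.append(candidates_per_slot[i][j][0])
        j = best[i][j][1]
    path.reverse()
    return path, total
```

With whole words as the units there is usually only one candidate per slot, so there is nothing for a search like this to choose between.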

The real trick is to get the correct prosody. Here are three sentences with
the same words but each with different prosody:

"I said 'yes.'"

"I said yes?"

"_I_ said '_yes_'"???!!

Both formant and concatenative systems add prosody. Adding prosody to
whole-word concatenative systems is difficult.

If you're in a buying mood, there are some excellent TTS systems available.
For example, Rhetorical (http://www.rhetorical.com) has some excellent
voices. The funniest TTS voice currently available is their "Southern
California female" voice; I use it for non-serious demos ("That's so
totally awesome.")

Commercial TTS is actually very intelligible and perfectly adequate for many
tasks.

-- 
  Moshe Yudkowsky
  Disaggregate
  2952 W Fargo
  Chicago, IL 60645 USA

  www.Disaggregate.com
  speech@pobox.com
  +1 773 764 8727

Sergio Serrano Revuelto

2003-Jul-16 07:22 UTC

[Asterisk-Users] Dial SessionTime

Look in the cdr table or in /var/log/asterisk/cdr-csv/Master.csv
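
A minimal sketch of pulling the call duration out of Master.csv with Python. The column order below is an assumption about the cdr-csv layout (duration and billsec as the 13th and 14th fields) and should be checked against your Asterisk version; the function name is hypothetical.

```python
import csv

# Assumed column order of the cdr-csv backend; verify against your
# Asterisk version before relying on it.
CDR_FIELDS = [
    "accountcode", "src", "dst", "dcontext", "clid", "channel",
    "dstchannel", "lastapp", "lastdata", "start", "answer", "end",
    "duration", "billsec", "disposition", "amaflags",
]


def call_durations(csv_file):
    """Yield (src, dst, duration_s, billsec_s) for each CDR row."""
    for row in csv.reader(csv_file):
        if len(row) < len(CDR_FIELDS):
            continue  # skip blank or truncated lines
        # Any extra trailing columns are ignored by zip().
        record = dict(zip(CDR_FIELDS, row))
        yield (
            record["src"],
            record["dst"],
            int(record["duration"]),  # seconds from call start to end
            int(record["billsec"]),   # seconds from answer to end
        )
```

To use it, open the file with `open("/var/log/asterisk/cdr-csv/Master.csv", newline="")` and iterate over `call_durations(f)`.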

srsergio




-----Original Message-----
From: asterisk-users-admin@lists.digium.com
[mailto:asterisk-users-admin@lists.digium.com] On behalf of
isamar@isamarmaia.org
Sent: Wednesday, July 16, 2003 23:54
To: asterisk-users@lists.digium.com
Subject: [Asterisk-Users] Dial SessionTime



Hi Folks,

After a successful Dial/H323,
is there any way to get the conversation duration time?


Thanks,

Isamar


_______________________________________________
Asterisk-Users mailing list
Asterisk-Users@lists.digium.com
http://lists.digium.com/mailman/listinfo/asterisk-users

isamar@isamarmaia.org

2003-Jul-16 14:54 UTC

[Asterisk-Users] Dial SessionTime

Hi Folks,

After a successful Dial/H323,
is there any way to get the conversation duration time?


Thanks,

Isamar
