Tom Grandgent wrote:
> Have you tried using 16kHz wideband? The sound quality is far superior to
> narrowband, IMO, even if you have to turn the VBR quality down (to say, 2)
> to save bandwidth.

Thanks for the info Tom!

Probably narrowband is hurting me, but my system is currently built on that. I want to try to get acceptable performance from narrowband if possible before trying to add wideband support to the system.

Consistent, noise-free synthesized speech certainly has fewer potentially problematic aspects to it though, I agree... Maybe we should all just make people type their communications into a keyboard :)

Reed

>
> Tom
>
> Reed Hedges <reed@mobilerobots.com> wrote:
>>
>> Hello, I'm wondering if anyone has done this before and has any advice, or if
>> anyone in general has ideas about it.
>>
>> I just implemented transmitting synthesized speech (text-to-speech) over Speex
>> (narrowband) in an application. I'm using Swift from Cepstral
>> (http://www.cepstral.com). The voice I'm using is a pretty deep male voice. I'm
>> telling Swift to generate audio at 8khz, then encoding each chunk of audio
>> output by Swift and sending it to a client.
>>
>> One interesting thing I've noticed is that as I increase Speex's encoding
>> quality, the output in the client sounds smoother (at my normal quality value of
>> 5 or 6 it sounds OK but occasionally has a hesitation or glitch) but "thinner"
>> -- less full or less resolution. Using the noise filter and changing the
>> complexity parameter don't seem to matter.
>>
>> I'll be experimenting with this more, but if anyone is interested I can send
>> some audio data generated by the Swift synthesizer. Or, if anyone has any
>> suggestions for how to tweak the synthesized audio for better encoding by Speex,
>> that would also be helpful (I don't know very much about audio or audio signal
>> processing yet.)
>>
>> Thanks!
>>
>> Reed
>

Tom Grandgent
2006-May-26 12:44 UTC
[Speex-dev] Transmitting synthetic speech using Speex?

Hi Reed,

I've been using Speex to transmit TTS for years. It works very well with no tweaking. I use Microsoft TTS ("Microsoft Mike") with Speex at 16kHz wideband and VBR quality 6. Sometimes I forget that the sound is even coming from another computer and being compressed+decompressed.

If anything, TTS seems easier for Speex to deal with than real voice. But I don't have any data to back up this impression.

Have you tried using 16kHz wideband? The sound quality is far superior to narrowband, IMO, even if you have to turn the VBR quality down (to say, 2) to save bandwidth.

Tom

Reed Hedges <reed@mobilerobots.com> wrote:
>
> Hello, I'm wondering if anyone has done this before and has any advice, or if
> anyone in general has ideas about it.
>
> I just implemented transmitting synthesized speech (text-to-speech) over Speex
> (narrowband) in an application. I'm using Swift from Cepstral
> (http://www.cepstral.com). The voice I'm using is a pretty deep male voice. I'm
> telling Swift to generate audio at 8khz, then encoding each chunk of audio
> output by Swift and sending it to a client.
>
> One interesting thing I've noticed is that as I increase Speex's encoding
> quality, the output in the client sounds smoother (at my normal quality value of
> 5 or 6 it sounds OK but occasionally has a hesitation or glitch) but "thinner"
> -- less full or less resolution. Using the noise filter and changing the
> complexity parameter don't seem to matter.
>
> I'll be experimenting with this more, but if anyone is interested I can send
> some audio data generated by the Swift synthesizer. Or, if anyone has any
> suggestions for how to tweak the synthesized audio for better encoding by Speex,
> that would also be helpful (I don't know very much about audio or audio signal
> processing yet.)
>
> Thanks!
>
> Reed
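
One thing that may be worth checking when encoding "each chunk of audio output by Swift": Speex works on fixed 20 ms frames, which is 160 samples at 8 kHz narrowband and 320 samples at 16 kHz wideband, and a synthesizer's output chunks will not necessarily line up with those boundaries. The thread does not establish that this is the cause of the occasional glitch, so treat the following only as a minimal sketch of re-framing arbitrary chunks. It assumes 16-bit mono PCM, and `FrameBuffer` and `frame_samples` are hypothetical names made up for illustration, not part of Speex or Swift. The encoding call itself (for example `speex_encode_int` in libspeex) is deliberately left out.

```python
"""Re-frame arbitrary PCM chunks into fixed 20 ms frames before encoding."""

FRAME_MS = 20          # Speex frames cover 20 ms of audio
BYTES_PER_SAMPLE = 2   # assumption: 16-bit signed PCM, mono


def frame_samples(sample_rate_hz):
    """Samples per 20 ms frame: 160 at 8 kHz, 320 at 16 kHz."""
    return sample_rate_hz * FRAME_MS // 1000


class FrameBuffer:
    """Hypothetical helper that collects chunks and emits whole frames."""

    def __init__(self, sample_rate_hz):
        self.frame_bytes = frame_samples(sample_rate_hz) * BYTES_PER_SAMPLE
        self._pending = bytearray()

    def push(self, chunk):
        """Add a chunk of PCM bytes; return every complete frame now available."""
        self._pending.extend(chunk)
        frames = []
        while len(self._pending) >= self.frame_bytes:
            frames.append(bytes(self._pending[:self.frame_bytes]))
            del self._pending[:self.frame_bytes]
        return frames

    def flush(self):
        """At end of utterance, pad leftover audio with silence to a full frame."""
        if not self._pending:
            return []
        padding = self.frame_bytes - len(self._pending)
        frame = bytes(self._pending) + b"\x00" * padding
        self._pending.clear()
        return [frame]


if __name__ == "__main__":
    buf = FrameBuffer(8000)  # narrowband
    for chunk in (b"\x00" * 500, b"\x00" * 200):  # stand-ins for synthesizer output
        for frame in buf.push(chunk):
            print("frame ready:", len(frame), "bytes")  # hand to the encoder here
    for frame in buf.flush():
        print("final padded frame:", len(frame), "bytes")
```

Each frame returned by `push` or `flush` is exactly one encoder frame, so leftover samples from one chunk are carried into the next instead of being dropped or encoded as a short frame. Switching to wideband in this sketch only means constructing `FrameBuffer(16000)`.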