thr3ads.net - Speex dev - [speex-dev] Removing silence at the start and end of sample encoded [Aug 2004]

Asger Kunuk Alstrup

2004-Aug-06 15:01 UTC

[speex-dev] Removing silence at the start and end of sample encoded

Hi,

Speex is great!

We are using it to compress hundreds of megabytes of speech for use in our
application that trains people in resuscitation. The previous version of our
product used Ogg Vorbis, but after switching to Speex, we achieve fantastic
compression, while retaining super quality. That allows us to cram more
translated versions of the software onto each CD-ROM, making everything better.

However, we have a small feature request for speexenc that would make things
perfect for us: support for trimming the start and end of the sound away during
the encoding process.

This feature request consists of two things:

1) support for automatic detection and trimming of silence at the start and end
of the sound.

2) support for intentionally skipping the very first and very last parts of the
input WAV, even if it is not silence.

This is because we record a lot of short sentences using a recording tool on the
PC - we are talking hundreds of sentences. Every time the actor records a sound,
he has to press space on the keyboard, wait a bit, read the text, and then
press space again to stop recording when he is done reading. As a result, we can
often hear the release of the space key at the start of the sample, and the
press of the space bar at the end of the sample.

So, our samples all look like this:

  "Noise---Silence---Noise-speech-with-wrong-gain---Silence---Noise"

We would like to automatically get this after feeding the WAVE file to speexenc:

  "Speech-without-noise-and-correct-gain"

with the leading noise and silence trimmed out. Speexenc already supports
automatic gain control and noise reduction, so all we need is the trimming of
the start and end, of both the noise and silence. Of course, the silence part
varies in length from sample to sample, but we can cap the noise to X
milliseconds.
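
The trimming described above could be sketched in Python roughly as follows.
This is only an illustration and not part of speexenc: the function names
(frame_rms, trim_edges), the 20 ms frame size and the RMS threshold are all
assumptions made for the example, and the samples are assumed to be 16-bit mono
PCM values held in a list. It first drops a fixed number of milliseconds at
each end (the capped key noise), then strips frames below the threshold from
both ends.

```python
import math

def frame_rms(frame):
    """Root-mean-square level of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def trim_edges(samples, rate, skip_ms=0, threshold=500.0, frame_ms=20):
    """Drop skip_ms from each end, then strip silent frames from both ends.

    All parameter names and defaults are hypothetical; tune them per recording.
    """
    skip = rate * skip_ms // 1000
    # Step 1: unconditionally cut the capped noise region at both ends.
    body = samples[skip:len(samples) - skip] if skip else list(samples)
    n = max(1, rate * frame_ms // 1000)
    frames = [body[i:i + n] for i in range(0, len(body), n)]
    # Step 2: find the first and last frames that reach the threshold.
    loud = [i for i, f in enumerate(frames) if frame_rms(f) >= threshold]
    if not loud:
        return []  # nothing but silence remained
    first, last = loud[0], loud[-1]
    return body[first * n:(last + 1) * n]
```

For instance, trim_edges(samples, 8000, skip_ms=20) would discard 20 ms at each
end of an 8 kHz recording before looking for silence. Because the cut is
aligned to frame boundaries, up to one frame of silence may remain on each side.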

As it is today, we have to manually edit these samples to remove the noise, and
then Cooledit can batch-process the silence out for us. After that, we use
speexenc to get the final sample, but it should be possible to do this in one
operation, making everything simpler.
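
A batch version of that workflow, run before speexenc, might look like the
sketch below. It uses only Python's standard wave and array modules. The names
trim_wav and trim_folder, the skip_ms and threshold parameters, and the
restriction to 16-bit PCM are assumptions for illustration. Unlike a
frame-energy detector, it uses a plain per-sample amplitude threshold, so quiet
speech tails may be cut.

```python
import array
import sys
import wave
from pathlib import Path

def trim_wav(src, dst, skip_ms=0, threshold=500):
    """Copy src to dst without the first/last skip_ms and without edge silence.

    Assumes 16-bit PCM; skip_ms and threshold are hypothetical tuning knobs.
    """
    with wave.open(str(src), "rb") as r:
        params = r.getparams()
        if params.sampwidth != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        pcm = array.array("h")
        pcm.frombytes(r.readframes(params.nframes))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    ch = params.nchannels
    skip = params.framerate * skip_ms // 1000 * ch
    pcm = pcm[skip:len(pcm) - skip] if skip else pcm
    # Indices of samples whose absolute value reaches the threshold.
    loud = [i for i, s in enumerate(pcm) if abs(s) >= threshold]
    if loud:
        start = loud[0] - loud[0] % ch        # align to a whole frame
        end = loud[-1] - loud[-1] % ch + ch
        pcm = pcm[start:end]
    else:
        pcm = pcm[:0]                         # only silence was found
    if sys.byteorder == "big":
        pcm.byteswap()
    with wave.open(str(dst), "wb") as w:
        w.setparams(params)                   # frame count is fixed up on close
        w.writeframes(pcm.tobytes())

def trim_folder(in_dir, out_dir, **kwargs):
    """Trim every .wav file in in_dir into out_dir; returns the files written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(in_dir).glob("*.wav")):
        trim_wav(src, out / src.name, **kwargs)
        written.append(out / src.name)
    return written
```

The trimmed files in the output folder can then be passed to speexenc as usual.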

If you cannot implement this for us, maybe you could give a few hints as to where
we should look to implement this feature? We are using the CVS version of Speex.

Thanks in advance,

Asger Ottar Alstrup

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'speex-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Jean-Marc Valin

2004-Aug-06 15:01 UTC

[speex-dev] Removing silence at the start and end of sample encoded

Actually, there are probably some batch programs that could do the job.
It's definitely not a job for speexenc, which I'd like to keep simple.

        Jean-Marc

-- 
Jean-Marc Valin, M.Sc.A., ing. jr.
LABORIUS (http://www.gel.usherb.ca/laborius)
Université de Sherbrooke, Québec, Canada


Anders S. Johansen

2004-Aug-06 15:01 UTC

[speex-dev] Removing silence at the start and end of sample encoded

Asger Kunuk Alstrup wrote:
> 1) support for automatic detection and trimming of silence at the start and
> end of the sound out.
>
> 2) support for intentionally skipping the very first and very last parts of
> the input WAV, even if it is not silence.

I have solved the exact same problem using Adobe Audition (previously
known as CoolEdit). It supports:

* Initial and trailing silence removal
* Normalization of volume
* Noise removal using a Fourier transform (i.e. subtracting the
"noiseprint" of the room and microphone)
* Batch runs

I cleaned approx. 60,000 WAV files for a digitized voice using Audition.
It took a couple of days, once I had the procedure down.

As to removing a "click" from the samples, I'm not sure I would recommend an
automated approach, as there are bound to be some samples that get
truncated this way. As you will have to inspect them anyway after
automated truncation, you are probably better off inspecting the raw
samples, identifying the ones with clicks visually from the spectrum, and
then removing the clicks manually. Having done that myself, I would say it's
possible to do that at a rate of several hundred samples per hour.
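
To narrow down which raw samples are worth inspecting, a rough pre-filter could
flag recordings whose start or end looks like "click, then silence". The sketch
below is an assumption-laden illustration, not the spectrum-based inspection
described above: it works on time-domain amplitude only, the names
leading_click and needs_review and all thresholds are hypothetical, and it will
produce false positives (for example on very short utterances), so flagged
files still need a manual look.

```python
def leading_click(samples, rate, window_ms=150, gap_ms=50, threshold=2000):
    """True if a loud burst near the start is followed by a quiet gap.

    The quiet gap of gap_ms must begin within the first window_ms.
    All defaults are hypothetical and need tuning per recording setup.
    """
    window = rate * window_ms // 1000
    gap = rate * gap_ms // 1000
    quiet_run = 0
    seen_loud = False
    for s in samples[:window + gap]:
        if abs(s) >= threshold:
            seen_loud = True
            quiet_run = 0          # still inside a loud region
        elif seen_loud:
            quiet_run += 1         # counting silence after the burst
            if quiet_run >= gap:
                return True
    return False

def needs_review(samples, rate, **kwargs):
    """Flag a recording whose start or end looks like 'click, then silence'."""
    return (leading_click(samples, rate, **kwargs)
            or leading_click(samples[::-1], rate, **kwargs))
```

Recordings for which needs_review returns False start and end either with
silence or with continuous sound, so they are less likely to contain an
isolated key click.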

I see you are Danish from your email address. If you need further 
assistance, you can contact me and I'll be happy to assist.

Sincerely,
   Anders S. Johansen

Asger Kunuk Alstrup

2004-Aug-06 15:01 UTC

[speex-dev] Removing silence at the start and end of sample encoded

Hi,

Jean-Marc Valin wrote:
> Actually, there are probably some batch programs that could do the
> job. It's definitely not a job for speexenc, which I'd like to keep
> simple.

Fair enough. Which batch programs should I be looking for? I had a look around,
and could not find any...

The reason speexenc seems like a good place to do this is that it already has
the routines to detect silence, and I figured it would be relatively easy to
implement based on that.

Failing to find a suitable batch program, we will probably make a local patch
for this - maybe you can help us a bit by giving a few pointers for how to do
this in speexenc?

Thanks,
Asger
> On Fri 09/01/2004 at 17:37, Asger Kunuk Alstrup wrote:
>> However, we have a small feature request for speexenc that would
>> make things perfect for us: Support for trimming the start and end
>> of the sound away, during the encoding process.
>>
>> This feature request consists of two things:
>>
>> 1) support for automatic detection and trimming of silence at the
>> start and end of the sound out.
>>
>> 2) support for intentionally skipping the very first and very last
>> parts of the input WAV, even if it is not silence.
>>
>> This is because we record a lot of small sentences using a recording
>> tool on the PC - we are talking hundreds of sentences. Every time
>> the actor records a sound, he has to press space first on the
>> keyboard, wait a bit, read the text, and then click space again to
>> stop recording when he is done reading it. Now, we can often hear
>> the release of the space key at the start of the sample, and the
>> pressing of space bar at the end of the sample.
>>
>> So, our samples all look like this:
>>
>>   "Noise---Silence---Noise-speech-with-wrong-gain---Silence---Noise"
>>
>> We would like to automatically get this after feeding the WAVE file
>> to speexenc:
>>
>>   "Speech-without-noise-and-correct-gain"
>>
>> with the leading noise and silence trimmed out. Speexenc already
>> supports automatic gain control and noise reduction, so all we need
>> is the trimming of the start and end, of both the noise and silence.
>> Of course, the silence part varies in length from sample to sample,
>> but we can cap the noise to X milliseconds.
>>
>> As it is today, we have to manually edit these samples to remove the
>> noise, and then Cooledit can batch-process the silence out for us.
>> After that, we use speexenc to get the final sample, but it should
>> be possible to do this in one operation, making everything simpler.
>>
>> If you cannot implement this for us, maybe you could give a few
>> hints as to where we should look to implement this feature? We are
>> using the CVS version of Speex.