Asger Kunuk Alstrup
2004-Aug-06 15:01 UTC
[speex-dev] Removing silence at the start and end of sample encoded
Hi,

Speex is great!

We are using it to compress hundreds of megabytes of speech for use in our application that trains people in resuscitation. The previous version of our product used Ogg Vorbis, but after switching to Speex, we achieve fantastic compression while retaining super quality. That allows us to cram more translated versions of the software onto each CD-ROM, making everything better.

However, we have a small feature request for speexenc that would make things perfect for us: support for trimming the start and end of the sound away during the encoding process.

This feature request consists of two things:

1) support for automatic detection and trimming of silence at the start and end of the sound.

2) support for intentionally skipping the very first and very last parts of the input WAV, even if they are not silence.

This is because we record a lot of small sentences using a recording tool on the PC; we are talking hundreds of sentences. Every time the actor records a sound, he has to press space on the keyboard, wait a bit, read the text, and then press space again to stop recording when he is done reading. We can often hear the release of the space key at the start of the sample, and the pressing of the space bar at the end of the sample.

So, our samples all look like this:

"Noise---Silence---Noise-speech-with-wrong-gain---Silence---Noise"

We would like to automatically get this after feeding the WAVE file to speexenc:

"Speech-without-noise-and-correct-gain"

with the leading and trailing noise and silence trimmed out. Speexenc already supports automatic gain control and noise reduction, so all we need is the trimming of the start and end, of both the noise and the silence. Of course, the silence part varies in length from sample to sample, but we can cap the noise to X milliseconds.

As it is today, we have to manually edit these samples to remove the noise, and then Cooledit can batch-process the silence out for us. After that, we use speexenc to get the final sample, but it should be possible to do this in one operation, making everything simpler.

If you cannot implement this for us, maybe you could give a few hints about where we should look to implement this feature? We are using the CVS version of Speex.

Thanks in advance,

Asger Ottar Alstrup

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'speex-dev-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.
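The two requested steps (skip a fixed number of milliseconds at each end, then trim whatever silence remains at the edges) can be sketched as a preprocessing pass run before speexenc. This is only an illustration and not part of Speex: the names `trim_samples` and `trim_wav`, the peak threshold of 500, and the 20 ms frame length are all assumptions, and the sketch handles 16-bit mono PCM WAV files only.

```python
import array
import sys
import wave


def trim_samples(samples, rate, skip_ms=0, threshold=500, frame_ms=20):
    """Skip a fixed span at each end, then trim silent frames at the edges.

    samples   -- sequence of signed 16-bit mono PCM values
    rate      -- sampling rate in Hz
    skip_ms   -- milliseconds dropped unconditionally from each end (key clicks)
    threshold -- peak amplitude below which a frame counts as silence (assumed)
    frame_ms  -- analysis frame length in milliseconds (assumed)
    """
    skip = rate * skip_ms // 1000
    body = samples[skip:len(samples) - skip] if skip else samples[:]
    frame = max(1, rate * frame_ms // 1000)

    def is_silent(start):
        chunk = body[start:start + frame]
        return max((abs(s) for s in chunk), default=0) < threshold

    # Advance past silent frames at the start.
    begin = 0
    while begin < len(body) and is_silent(begin):
        begin += frame
    # Walk back over silent frames at the end.
    end = len(body)
    while end - frame >= begin and is_silent(end - frame):
        end -= frame
    return body[begin:end]


def trim_wav(src, dst, **kwargs):
    """Read a 16-bit mono WAV file, trim it, and write the result to dst."""
    with wave.open(src, "rb") as w:
        if w.getnchannels() != 1 or w.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        rate = w.getframerate()
        pcm = array.array("h")
        pcm.frombytes(w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian
    out = trim_samples(pcm, rate, **kwargs)
    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(out.tobytes())
```

A fixed peak threshold is the simplest possible silence test; real recordings with room noise would likely need a threshold tuned per session, and any automatic cut should be checked by ear on a few samples.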
Jean-Marc Valin
2004-Aug-06 15:01 UTC
[speex-dev] Removing silence at the start and end of sample encoded
Actually, there are probably some batch programs that could do the job. It's definitely not a job for speexenc, which I'd like to keep simple.

Jean-Marc

On Fri 09/01/2004 at 17:37, Asger Kunuk Alstrup wrote:
> However, we have a small feature request for speexenc that would make things
> perfect for us: Support for trimming the start and end of the sound away, during
> the encoding process.

--
Jean-Marc Valin, M.Sc.A., ing. jr.
LABORIUS (http://www.gel.usherb.ca/laborius)
Université de Sherbrooke, Québec, Canada
Anders S. Johansen
2004-Aug-06 15:01 UTC
[speex-dev] Removing silence at the start and end of sample encoded
Asger Kunuk Alstrup wrote:
> 1) support for automatic detection and trimming of silence at the start and end
> of the sound.
>
> 2) support for intentionally skipping the very first and very last parts of the
> input WAV, even if it is not silence.

I have solved the exact same problem using Adobe Audition (previously known as CoolEdit). It supports:

* Initial and trailing silence removal
* Normalization of volume
* Noise removal using Fourier transformation (i.e. subtracting the "noiseprint" of room + microphone)
* Batch runs

I cleaned approx. 60,000 WAV files for a digitized voice using Audition. It took a couple of days, once I had the procedure down.

As to removing a "click" from the samples, I'm not sure I would recommend an automated approach, as there are bound to be some samples that get truncated this way. As you will have to inspect them anyway after automated truncation, you are probably better off inspecting the raw samples, identifying the ones with clicks visually from the spectrum, and then removing the clicks manually. Having done that myself, I would say it's possible to do that at a rate of several hundred samples per hour.

I see you are Danish from your email address. If you need further assistance, you can contact me and I'll be happy to assist.

Sincerely,
Anders S. Johansen
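Of the steps listed above, volume normalization is the easiest to illustrate outside an audio editor. The following is a minimal sketch of peak normalization for signed 16-bit samples. It is not how Audition does it: the function name `normalize_peak` and the target level of 0.9 of full scale are assumptions chosen for the example.

```python
def normalize_peak(samples, target=0.9, full_scale=32767):
    """Scale samples so the loudest one reaches target * full_scale.

    samples -- sequence of signed 16-bit PCM values
    target  -- desired peak as a fraction of full scale (assumed value)
    """
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return list(samples)  # all-silent input: nothing to scale
    gain = target * full_scale / peak
    # Clamp to the 16-bit range in case rounding pushes a value over.
    return [max(-32768, min(32767, round(s * gain))) for s in samples]
```

Peak normalization applies one gain to the whole file, so a single loud click sets the level for everything else. That is one more reason to remove clicks before normalizing rather than after.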
Asger Kunuk Alstrup
2004-Aug-06 15:01 UTC
[speex-dev] Removing silence at the start and end of sample encoded
Hi,

Jean-Marc Valin wrote:
> Actually, there are probably some batch programs that could do the
> job. It's definitely not a job for speexenc, which I'd like to keep
> simple.

Fair enough. Which batch programs should I be looking for? I had a look around and could not find any...

The reason speexenc seems like a good place to do this is that it already has the routines to detect silence, and I figured it would be relatively easy to implement based on that.

Failing to find a suitable batch program, we will probably make a local patch for this. Maybe you can help us a bit by giving a few pointers for how to do this in speexenc?

Thanks,
Asger
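If the trimming is done by a separate tool, as suggested earlier in the thread, the remaining batch step is a loop that hands each cleaned WAV file to speexenc. The sketch below only builds the command lines. The helper name `encode_commands` and the `.spx` output naming are assumptions; the only speexenc behaviour relied on is the basic `speexenc input.wav output.spx` invocation, and any extra options are left for the caller to supply.

```python
from pathlib import Path


def encode_commands(src_dir, dst_dir, encoder="speexenc", extra_args=()):
    """Yield one argv list per WAV file found directly in src_dir.

    extra_args is passed through unchanged, so encoder options stay under
    the caller's control instead of being hard-coded here.
    """
    dst = Path(dst_dir)
    for wav in sorted(Path(src_dir).glob("*.wav")):
        yield [encoder, *extra_args, str(wav), str(dst / (wav.stem + ".spx"))]
```

Each yielded list can be passed to `subprocess.run(cmd, check=True)` to run the encoder, which keeps the file discovery testable without speexenc installed.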