thr3ads.net - Speex dev - [speex-dev] Memory leak in denoiser + a few questions [Aug 2004]

Steve Kann

2004-Aug-06 15:02 UTC

[speex-dev] Memory leak in denoiser + a few questions

Jean-Marc Valin wrote:
>>Reverberation suppression?
>>    
>>
>
>Basically, it means that if you are in a room with lots of echo (long
>decay), I can reduce it a bit.
>
>  
>
>>I guess this would help reduce local source echoes?  I've never 
>>_noticed_ that to be a problem in my use, but I would imagine that 
>>using a notebook's built-in microphone, you'd get some echo off of the
>>screen and stuff [also from the whole room]..
>>
>>Most of these echoes aren't so bad, but I guess they might make the 
>>encoding job harder.  I'd sure rather see the echo cancellation 
>>finished [not that I have any say on what you work on!!!].
>>    
>>
>
>Well, I'm still looking for help :)
>
>  
>
>>Here are the numbers I got doing VAD on 655 seconds of audio (about half
>>is speech, half is absolute silence [0's]).
>>P3-600: 25 seconds
>>Athlon XP 1700+ (1.45 GHz): 5 seconds
>>P4 2.8 GHz: 8.8 seconds
>>    
>>
>
>These numbers sound like a problem I had a while ago with the decoder.
>The VAD shouldn't take much CPU so I suspect there might be floating
>point underflows in some part, slowing down the Intel CPUs a lot (for
>some reason, the AMD CPUs seem to handle underflows faster).
>  
>
Hmm, how can I find that out?  How much CPU would you expect it to take?

I've been playing with oprofile, but I don't see it getting that finely 
grained.

>>Anyway, I think I might need to find a less computationally intensive 
>>VAD solution for the conference.  VAD is currently only used when 
>>people connect via the PSTN, so they presumably have a decent SNR, and 
>>I may be able to get away with an energy envelope type of thing, 
>>without needing frequency domain analysis.  But before I go and start 
>>coding this, are there any simple optimizations that can be done to the 
>>preprocessor when it is being used only for the VAD decision?
>>    
>>
>
>Have you tried using the (less accurate) VAD that's in the codec itself
>(SPEEX_SET_VAD)?
>

I'll take a look at that.  In this case [in the conferencing 
application], I'm not actually using speex encoding [these are PSTN 
callers, I do VAD in clients when I control them], so I'd need to see if 
I could rip it out of speex to use it.

Also, I do have a couple of patches to the preprocessor to send along, 
actually; basically, they turn the start and continue probabilities into 
parameters that callers can set.  We're currently using very low 
probabilities, much lower than your defaults: VAD_START=0.05, 
VAD_CONTINUE=0.02.  We also have a 20-frame (2/5 sec) "tail" outside 
the preprocessor, which continues treating some frames as speech 
after the detector has dropped out.
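
The "tail" described above could be sketched roughly as follows. This is only an illustration, not the code used in the conference application: the `VadTail` type and both functions are hypothetical names, and the 20 ms frame length is inferred from "20 frames = 2/5 sec".

```c
#include <stdbool.h>

/* Hypothetical helper, not part of Speex: keeps reporting speech for
 * `tail_frames` frames after the detector itself has dropped out. */
typedef struct {
    int tail_frames;      /* e.g. 20 frames of 20 ms each = 0.4 s */
    int frames_remaining; /* frames left in the current tail */
} VadTail;

static void vad_tail_init(VadTail *t, int tail_frames)
{
    t->tail_frames = tail_frames;
    t->frames_remaining = 0;
}

/* Feed one raw detector decision per frame; returns the smoothed decision. */
static bool vad_tail_update(VadTail *t, bool detector_says_speech)
{
    if (detector_says_speech) {
        t->frames_remaining = t->tail_frames; /* re-arm the tail */
        return true;
    }
    if (t->frames_remaining > 0) {
        t->frames_remaining--;
        return true; /* detector dropped out, but still inside the tail */
    }
    return false;
}
```

A caller would pass the per-frame VAD result to `vad_tail_update()` and use the returned value to decide whether the frame is mixed into the conference.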

Here's a patch:

=================================================
Diff for file preprocess.c, 1.2 -> 1.3
Index: preprocess.c
===================================================================
RCS file: /home/UniServ/dls/CVS/hms/app_conference/libspeex/preprocess.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -w -r1.2 -r1.3
--- preprocess.c	2003/11/07 23:40:23	1.2
+++ preprocess.c	2004/02/06 17:10:24	1.3
@@ -145,6 +145,9 @@
    st->agc_level = 8000;
    st->vad_enabled = 0;
 
+   st->speech_prob_start = SPEEX_PROB_START ;
+   st->speech_prob_continue = SPEEX_PROB_CONTINUE ;
+   
    st->frame = (float*)speex_alloc(2*N*sizeof(float));
    st->ps = (float*)speex_alloc(N*sizeof(float));
    st->gain2 = (float*)speex_alloc(N*sizeof(float));
@@ -435,12 +438,19 @@
       st->speech_prob = p0/(1e-25+p1+p0);
      /*fprintf (stderr, "%f %f %f ", tot_loudness, st->loudness2, st->speech_prob);*/
 
+	/* decide if frame is speech using speech probability settings */
+
 /*      if (st->speech_prob> .35 || (st->last_speech < 20 && st->speech_prob>.1)) */
-      if (st->speech_prob> .20 || (st->last_speech < 20 && st->speech_prob>.05))
+	if (
+		st->speech_prob > st->speech_prob_start
+		|| ( st->last_speech < 20 && st->speech_prob > st->speech_prob_continue )
+	)
       {
          is_speech = 1;
          st->last_speech = 0;
-      } else {
+	} 
+	else 
+	{
          st->last_speech++;
          if (st->last_speech<20)
            is_speech = 1;
@@ -985,6 +995,30 @@
    case SPEEX_PREPROCESS_GET_VAD:
       (*(int*)ptr) = st->vad_enabled;
       break;
+      
+	case SPEEX_PREPROCESS_SET_PROB_START:
+		st->speech_prob_start = (*(float*)ptr) ;
+		if ( st->speech_prob_start > 1 )
+			st->speech_prob_start = st->speech_prob_start / 100 ;
+		if ( st->speech_prob_start > 1 || st->speech_prob_start < 0 )
+			st->speech_prob_start = SPEEX_PROB_START ;
+		break ;
+	case SPEEX_PREPROCESS_GET_PROB_START:
+		(*(float*)ptr) = st->speech_prob_start ;
+		break ;
+      
+	case SPEEX_PREPROCESS_SET_PROB_CONTINUE:
+		st->speech_prob_continue = (*(float*)ptr) ;
+		if ( st->speech_prob_continue > 1 )
+			st->speech_prob_continue = st->speech_prob_continue / 100 ;
+		if ( st->speech_prob_continue > 1 || st->speech_prob_continue < 0 )
+			st->speech_prob_continue = SPEEX_PROB_CONTINUE ;
+		break ;
+		break ;
+	case SPEEX_PREPROCESS_GET_PROB_CONTINUE:
+		(*(float*)ptr) = st->speech_prob_continue ;
+		break ;
+      
    default:
      speex_warning_int("Unknown speex_preprocess_ctl request: ", request);
       return -1;

Diff for file speex_preprocess.h, 1.1 -> 1.2
Index: speex_preprocess.h
===================================================================
RCS file: /home/UniServ/dls/CVS/hms/app_conference/libspeex/speex_preprocess.h,v
retrieving revision 1.1
retrieving revision 1.2
diff -u -w -r1.1 -r1.2
--- speex_preprocess.h	2003/11/06 21:57:59	1.1
+++ speex_preprocess.h	2004/02/06 17:10:24	1.2
@@ -49,6 +49,10 @@
    float  agc_level;
    int    vad_enabled;
 
+	// probabilities to check speech_prob against
+	float speech_prob_start ;
+	float speech_prob_continue ;
+
    float *frame;             /**< Processing frame (2*ps_size) */
    float *ps;                /**< Current power spectrum */
    float *gain2;             /**< Adjusted gains */
@@ -108,8 +112,9 @@
 
 /** Used like the ioctl function to control the preprocessor parameters */
 int speex_preprocess_ctl(SpeexPreprocessState *st, int request, void *ptr);
-
 
+#define SPEEX_PROB_START 0.35 
+#define SPEEX_PROB_CONTINUE 0.1
 
 #define SPEEX_PREPROCESS_SET_DENOISE 0
 #define SPEEX_PREPROCESS_GET_DENOISE 1
@@ -122,6 +127,12 @@
 
 #define SPEEX_PREPROCESS_SET_AGC_LEVEL 6
 #define SPEEX_PREPROCESS_GET_AGC_LEVEL 7
+
+#define SPEEX_PREPROCESS_SET_PROB_START 8
+#define SPEEX_PREPROCESS_GET_PROB_START 9
+
+#define SPEEX_PREPROCESS_SET_PROB_CONTINUE 10
+#define SPEEX_PREPROCESS_GET_PROB_CONTINUE 11
 
 #ifdef __cplusplus

=================================================
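
To make the range handling in the new `SET_PROB_*` cases easier to follow, here is a stand-alone sketch of the same logic. The function name `normalize_prob` is hypothetical and does not appear in the patch, and the defaults are redefined locally with an `f` suffix purely so the snippet compiles on its own.

```c
/* Defaults copied from the speex_preprocess.h hunk of the patch
 * (redefined here as float literals so this sketch stands alone). */
#define SPEEX_PROB_START    0.35f
#define SPEEX_PROB_CONTINUE 0.1f

/* Hypothetical helper mirroring what the SET_PROB_START and
 * SET_PROB_CONTINUE cases do with the caller's value. */
static float normalize_prob(float requested, float fallback)
{
    float p = requested;
    if (p > 1.0f)
        p = p / 100.0f;   /* values above 1 are treated as percentages */
    if (p > 1.0f || p < 0.0f)
        p = fallback;     /* still out of range: fall back to the default */
    return p;
}
```

So both `0.05` and `5` end up as a threshold of 0.05, while `150` or `-0.2` revert to the default. With the patched library, a caller would presumably pass a `float` by pointer, e.g. `float p = 0.05f; speex_preprocess_ctl(st, SPEEX_PREPROCESS_SET_PROB_START, &p);`, since the new cases dereference `ptr` as `float*`.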

>        Jean-Marc
>

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to
'speex-dev-request@xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is
needed.
Unsubscribe messages sent to the list will be ignored/filtered.

Jean-Marc Valin

2004-Aug-06 15:02 UTC

[speex-dev] Memory leak in denoiser + a few questions

> Hmm, How can I find that out?  How much CPU would you expect it to
> take?

I don't know. It's been a while since I last played with that code, but
I'd expect it to take less time.

> I've been playing with oprofile, but I don't see it getting that
> finely grained..

Can you make sure the time is spent in the VAD and not in the encoder or
decoder (at the other end) when the VAD is on? The underflow problem I
had appeared with VBR, but the problem was in the decoder.

> I'll take a look at that.  In this case [in the conferencing
> application], I'm not actually using speex encoding [these are PSTN
> callers, I do VAD in clients when I control them], so I'd need to see
> if I could rip it out of speex to use it.

Don't waste too much time, though. That VAD is really basic.

> Also, I do have a couple of patches to the preprocessor to send along
> actually; basically this makes the start and continue probabilities
> parameters that can be set by callers.  We're currently using very low
> probabilities;   Much lower than your defaults, VAD_START=0.05
> VAD_CONTINUE=0.02.  We also have 20 frame (2/5 sec) "tail" that is
> outside the preprocessor, which continues treating some frames as
> speech after the detector has dropped out.

That's the same patch you sent a while ago, right? Sorry, I haven't had
much time for Speex lately.

        Jean-Marc


-- 
Jean-Marc Valin
http://www.xiph.org/~jm/
LABORIUS
Université de Sherbrooke, Québec, Canada


Steve Kann

2004-Aug-06 15:02 UTC

[speex-dev] Memory leak in denoiser + a few questions

Jean-Marc Valin wrote:
>>Hmm, How can I find that out?  How much CPU would you expect it to
>>take?
>
>I don't know. It's been a while since I last played with that code, but
>I'd expect it to take less time. 
>
>>I've been playing with oprofile, but I don't see it getting that
>>finely grained..
>
>Can you make sure the time is spent in the VAD and not in the encoder or
>decoder (at the other end) when the VAD is on (the underflow problem I
>had appeared with VBR, but the problem was in the decoder).
>

The tests I did to determine VAD CPU usage were pretty basic, and I'm 
sure it was just VAD being used.  I made a small test program which 
reads a bunch of audio data into a buffer, and then does:

    /* speex implementation */
    {
        SpeexPreprocessState *dsp = speex_preprocess_state_init(
            AST_CONF_BLOCK_SAMPLES, AST_CONF_SAMPLE_RATE ) ;
        int set;

        /* enable only the VAD; turn the denoiser and AGC off */
        set = 1;
        speex_preprocess_ctl( dsp, SPEEX_PREPROCESS_SET_VAD, &set ) ;
        set = 0;
        speex_preprocess_ctl( dsp, SPEEX_PREPROCESS_SET_DENOISE, &set ) ;
        speex_preprocess_ctl( dsp, SPEEX_PREPROCESS_SET_AGC, &set ) ;

        while(reps-- > 0) {
          int i;

          printf("beginning pass\n");

          /* run the preprocessor over the whole buffer, one block at a time */
          for(i=0;i<BUFSAMP;i+=AST_CONF_BLOCK_SAMPLES) {
              speex_preprocess(dsp, audbuf+i, NULL);
          }
        }
    }

The sample data I used was 1024x1024 samples, and I went through it 5 
times (nreps = 5, BUFSAMP = 1024*1024).  The data in the buffer is 
537760 samples of speech, with the rest being zeros.
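
For a rough sense of what these timings mean per channel, the arithmetic can be sketched as below. The 8000 Hz sample rate is an assumption (the value of `AST_CONF_SAMPLE_RATE` is not shown), though it is consistent with the "655 seconds" quoted earlier in the thread; the function names are made up for this illustration.

```c
/* Seconds of audio covered by `reps` passes over a buffer. */
static double audio_seconds(long reps, long samples_per_pass,
                            double sample_rate_hz)
{
    return (double)reps * (double)samples_per_pass / sample_rate_hz;
}

/* Fraction of one CPU that a single real-time stream would need. */
static double cpu_fraction(double cpu_seconds, double audio_secs)
{
    return cpu_seconds / audio_secs;
}

/* How many real-time streams one CPU could sustain at that rate. */
static double max_streams(double cpu_seconds, double audio_secs)
{
    return audio_secs / cpu_seconds;
}
```

Under that assumption, 5 passes over 1024*1024 samples is 655.36 seconds of audio. The 25 seconds measured on the P3-600 then works out to about 3.8% of a CPU per stream (roughly 26 streams per CPU), the Athlon's 5 seconds to about 0.76% (roughly 131 streams), and the P4's 8.8 seconds to about 1.3% (roughly 74 streams).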
>>I'll take a look at that.  In this case [in the conferencing
>>application], I'm not actually using speex encoding [these are PSTN
>>callers, I do VAD in clients when I control them], so I'd need to see
>>if I could rip it out of speex to use it.
>
>Don't waste too much time, though. That VAD is really basic.
>

I might be able to get by, in this application, with something more 
basic, though.  In my limited testing, I seem to remember getting SNR 
from PSTN clients which was _much_ better than that from microphones on 
PCs.
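
As an illustration of what "something more basic" might look like, here is a minimal time-domain energy detector. It is a sketch under stated assumptions, not code from Speex or from the conference application: the `EnergyVad` type, its fields, and the 0.05 adaptation rate are all invented for the example and would need tuning against real PSTN audio.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical energy-envelope VAD state (not a Speex type). */
typedef struct {
    float noise_floor; /* running estimate of background frame energy */
    float ratio;       /* speech if energy > ratio * noise_floor ... */
    float min_energy;  /* ... and also above this absolute minimum */
} EnergyVad;

/* Mean squared sample value of one frame. */
static float frame_energy(const short *pcm, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += (double)pcm[i] * (double)pcm[i];
    return n ? (float)(sum / (double)n) : 0.0f;
}

/* Returns true if the frame looks like speech; adapts the noise floor
 * only on frames classified as non-speech. */
static bool energy_vad_update(EnergyVad *v, const short *pcm, size_t n)
{
    float e = frame_energy(pcm, n);
    float threshold = v->ratio * v->noise_floor;
    if (threshold < v->min_energy)
        threshold = v->min_energy; /* guards against an all-zero floor */
    bool speech = e > threshold;
    if (!speech)
        v->noise_floor += 0.05f * (e - v->noise_floor);
    return speech;
}
```

The absolute minimum matters for input like the test data above, where half the buffer is exact zeros and the noise floor would otherwise decay to nothing. One known weakness of this simple form is that a permanent rise in background noise can leave the detector stuck reporting speech.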

I'd like to be able to handle some number of hundreds of calls in the 
conference, with up to maybe 100 of them being processed by the VAD.  
Right now, the VAD is the dominant part of the conference's CPU usage 
[encoding and decoding are actually smaller, because the channels which do 
encoding/decoding do VAD on the client end, and the channels that do 
need VAD are ulaw encoded].

>>Also, I do have a couple of patches to the preprocessor to send along
>>actually; basically this makes the start and continue probabilities
>>parameters that can be set by callers.  We're currently using very low
>>probabilities;   Much lower than your defaults, VAD_START=0.05
>>VAD_CONTINUE=0.02.  We also have 20 frame (2/5 sec) "tail" that is
>>outside the preprocessor, which continues treating some frames as
>>speech after the detector has dropped out.
>
>That's the same patch you sent a while ago, right? Sorry, I haven't had
>much time for Speex lately.
>

I already sent that?  I forgot :)  We're all busy!

Gustavo García Bernardo

2004-Aug-06 15:02 UTC

[speex-dev] Echo cancel

Hi,

I've been testing the Speex echo cancellation, but all I get is the echo
amplified :-(

I'm using a reference signal ref(n)=signal1(n), and an echo signal
echo(n)=signal2(n)+0.2*signal1(n-10).
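
One way to sanity-check test signals like these, independently of Speex, is to run them through a textbook NLMS adaptive filter and see whether the echo can be removed at all. The sketch below is not the Speex echo canceller and makes several assumptions: white noise from a small LCG stands in for signal1, signal2 is set to zero (no near-end speech), and the filter length and step size are arbitrary choices.

```c
#include <stddef.h>
#include <stdint.h>

#define ECHO_DELAY 10    /* samples, as in the setup described above */
#define ECHO_GAIN  0.2f
#define TAPS       32    /* assumed filter length; must exceed ECHO_DELAY */

/* Deterministic noise in [-1, 1) from a linear congruential generator. */
static float next_noise(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return (float)(*state >> 8) / 8388608.0f - 1.0f;
}

/* ref(n) = signal1(n); echo(n) = 0.2 * signal1(n - 10), signal2 = 0. */
static void make_test_signals(float *ref, float *echo, size_t n)
{
    uint32_t seed = 12345u;
    for (size_t i = 0; i < n; i++) {
        ref[i] = next_noise(&seed);
        echo[i] = (i >= ECHO_DELAY) ? ECHO_GAIN * ref[i - ECHO_DELAY] : 0.0f;
    }
}

/* Runs NLMS over the signals and returns residual energy divided by
 * echo energy, measured over the last quarter of the input. */
static float nlms_residual_ratio(const float *ref, const float *echo, size_t n)
{
    float w[TAPS] = {0};
    const float mu = 0.5f, eps = 1e-6f;
    double echo_energy = 0.0, residual_energy = 0.0;

    for (size_t i = TAPS; i < n; i++) {
        float est = 0.0f, power = eps;
        for (size_t k = 0; k < TAPS; k++) {
            est += w[k] * ref[i - k];          /* estimated echo */
            power += ref[i - k] * ref[i - k];  /* input power for normalization */
        }
        float err = echo[i] - est;             /* echo-cancelled output */
        for (size_t k = 0; k < TAPS; k++)
            w[k] += mu * err * ref[i - k] / power;
        if (i >= n - n / 4) {
            echo_energy += (double)echo[i] * echo[i];
            residual_energy += (double)err * err;
        }
    }
    return (float)(residual_energy / echo_energy);
}
```

A ratio well below 1 means the echo is being attenuated; a ratio above 1 would correspond to the "amplified echo" symptom. If a plain filter like this handles the signals, the test signals themselves are probably not the problem.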

Is anybody using it? How can I test it?

Thank you very much.

G.

Jean-Marc Valin

2004-Aug-06 15:02 UTC

[speex-dev] Memory leak in denoiser + a few questions

> > These numbers sound like a problem I had a while ago with the decoder.
> > The VAD shouldn't take much CPU so I suspect there might be floating
> > point underflows in some part, slowing down the Intel CPUs a lot (for
> > some reason, the AMD CPUs seem to handle underflows faster).
>
> Hmm, How can I find that out?  How much CPU would you expect it to
> take?

I just did a quick check and you shouldn't even notice the amount of CPU
the VAD takes. The problem is most likely caused by floating point
exceptions of some sort, either NaN's, denorms, or underflows. I've
never encountered the problem, so it's likely specific to the kind of
data you process. Can you try pinpointing the problem a bit and then
send me a sample file?
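
One possible way to start pinpointing this is to scan float buffers for the value classes mentioned above, using the standard C99 `fpclassify` macro. This is only a sketch: the `FloatStats` type and `scan_floats` function are hypothetical, and which buffers to scan (for example the preprocessor's float state arrays such as `st->ps`, after a run of all-zero input) is a guess rather than something established in this thread.

```c
#include <math.h>
#include <stddef.h>

/* Counts of "unusual" values found in a buffer (hypothetical type). */
typedef struct {
    size_t nan_count;
    size_t inf_count;
    size_t denormal_count;
} FloatStats;

/* Classifies every element of buf and tallies NaNs, infinities,
 * and denormal (subnormal) values. */
static FloatStats scan_floats(const float *buf, size_t n)
{
    FloatStats s = {0, 0, 0};
    for (size_t i = 0; i < n; i++) {
        switch (fpclassify(buf[i])) {
        case FP_NAN:       s.nan_count++;      break;
        case FP_INFINITE:  s.inf_count++;      break;
        case FP_SUBNORMAL: s.denormal_count++; break;
        default:           break; /* zero or normal: nothing to report */
        }
    }
    return s;
}
```

Calling this after each frame and logging the first frame with a non-zero count would narrow down where in the input the slow path begins, which in turn suggests what sample file to send.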

        Jean-Marc


-- 
Jean-Marc Valin
http://www.xiph.org/~jm/
LABORIUS
Université de Sherbrooke, Québec, Canada

