M. LALMI ALL-RTP
2019-Sep-05 12:55 UTC
[opus] Opus VAD in 1.3 (and Music/Speech detection)
Hello,

I am studying different VAD (and speech/music detection) methods, and I find the GRU-based one implemented in Opus 1.3 very interesting.

Is there any documentation on how to compute the input feature vector (25 elements), and a description of how the GRU was trained (RFC, presentation, etc.)? I am not able to follow all of the source code in analysis.c.

Also, what happens if audio frames contain both speech and music (as with the hold music of call centers)? Will they be detected as speech or as music?

Thanks in advance for your help.

Best regards,
Mohamed