Hi everyone,

As you may have guessed, Speex 1.2 is slowly approaching, though there's still a lot left to do, so I can't say how long it'll take. I thought this was the right time to ask if there's anything missing or that can be improved to make 1.2 better. At this point, it can't be anything major, but there are still some changes that are possible, e.g.:

- Improving some component that doesn't behave very well.
- Improving a confusing API.
- Improving robustness of a component to a specific condition.
- Adding a minor feature.
...

So what's your favourite "I wish Speex could..." or "Speex sucks because..."? I won't promise I'll take everything into account, but I'll do my best -- if not for 1.2, then maybe for 1.4. Oh, and no, I will not make Speex compatible with G.729 :-)

Jean-Marc

P.S. I finally got around to posting my trivial Speex client that shows how to use Speex with the echo canceller, preprocessor and jitter buffer. It's in svn at http://svn.xiph.org/trunk/speex/speexclient/
It would be nice to merge the Run-Time SSE/Altivec patch that we have been maintaining.

Aron Rosenberg
SightSpeed Inc.

-----Original Message-----
From: speex-dev-bounces@xiph.org [mailto:speex-dev-bounces@xiph.org] On Behalf Of Jean-Marc Valin
Sent: Monday, November 13, 2006 1:32 AM
To: speex-dev
Subject: [Speex-dev] Quick survey for Speex 1.2

_______________________________________________
Speex-dev mailing list
Speex-dev@xiph.org
http://lists.xiph.org/mailman/listinfo/speex-dev
On Monday, November 13, 2006, 04:32:28, Jean-Marc Valin wrote:
> ... -- if not for 1.2, then maybe for 1.4.

Is there some phobia about odd numbers? :-)

-- 
Rod Dorman
rodd@polylogics.com
"The avalanche has already started, it is too late for the pebbles to vote." - Ambassador Kosh
> On Monday, November 13, 2006, 04:32:28, Jean-Marc Valin wrote:
>> ... -- if not for 1.2, then maybe for 1.4.
>
> Is there some phobia about odd numbers? :-)

No, they're used for development versions, just like 1.1 recently became 1.2beta.

Jean-Marc
Aron Rosenberg wrote:
> It would be nice to merge the Run-Time SSE/Altivec patch that we have
> been maintaining.

As I've mentioned before, this is something that I'd really like to do, but I want to make sure it doesn't make the code harder to maintain. Your previous patch addressed a lot of my earlier issues with the patch, but there are still a few things to do before I can merge it. My goal is that most files will only have to #include the run-time SSE/Altivec header to have that enabled. I don't mind too much if it makes that header ugly, as long as the ugliness doesn't spread to the platform-independent part.

Jean-Marc
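A minimal sketch of what such a single-header arrangement could look like: callers see one function, and a pointer set once at start-up selects the implementation. Every name here is hypothetical and none is taken from the Speex tree or from the patch under discussion. The "unrolled" routine only stands in for real SSE/Altivec code, and the caller is assumed to pass in the result of whatever CPU probe the platform provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical run-time dispatch header; not Speex code. */
typedef float (*demo_inner_prod_fn)(const float *x, const float *y, size_t len);

/* Portable reference implementation. */
static float demo_inner_prod_generic(const float *x, const float *y, size_t len)
{
    float sum = 0.0f;
    size_t i;
    for (i = 0; i < len; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Stand-in for a SIMD version: four partial sums, as a 4-lane vector
 * unit would keep. Real code would use intrinsics here. */
static float demo_inner_prod_unrolled4(const float *x, const float *y, size_t len)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }
    for (; i < len; i++)          /* leftover samples */
        s0 += x[i] * y[i];
    return (s0 + s1) + (s2 + s3);
}

/* Selected implementation; defaults to the portable one. */
static demo_inner_prod_fn demo_inner_prod_impl = demo_inner_prod_generic;

/* Call once at start-up. has_simd is the caller's CPU-probe result (assumption). */
void demo_dispatch_init(int has_simd)
{
    demo_inner_prod_impl = has_simd ? demo_inner_prod_unrolled4
                                    : demo_inner_prod_generic;
}

/* The only symbol platform-independent files would need to call. */
float demo_inner_prod(const float *x, const float *y, size_t len)
{
    assert(x != NULL && y != NULL);
    return demo_inner_prod_impl(x, y, len);
}
```

The platform-independent code would call only demo_inner_prod(); all the per-CPU ugliness stays behind the pointer, which is the property asked for above.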
On Monday 13 November 2006 10:32, Jean-Marc Valin wrote:
> So what's your favourite "I wish Speex could..." or "Speex sucks
> because..."?

I wish the numerous project files (for Visual C++, Xcode etc.) that are not so "common" got tested and updated more frequently. They usually don't work out of the box (because files that were added are missing) and it wouldn't be much work to fix these problems if somebody were to regularly try these project files and add the missing files.

Peter

-- 
One family builds a wall, two families enjoy it.
> 1) The voice quality of output after coding and decoding in WB mode is
> unacceptable using the SVN fixed-point code,
> while the output of floating-point code is good.
> ...
> The test flow is input-->coder---->decoder---->.wav file--->player.
>
> The quality decrease of the decoder output is notable to ears.
>
> according to above test, some fixed-point codes must need to be corrected.
>
> What's your experience with current fixed-point code?

Indeed, your file triggers a problem with the fixed-point code. It doesn't happen with all files, so I'm not sure yet if it's a regression (if you've tested with other versions, I'm interested in what you find). I'll be investigating the issue and will fix this as soon as I can. Thanks for reporting the problem.

> 2) The rtp profile is not clear
>
> In the svn directory /doc, draft-herlein-speex-rtp-profile-02.txt is the
> latest draft.
>
> According to the chapter 5,
> "The encoding and decoding algorithm can change the bit rate at any 20
> msec frame boundary, with the bit rate change notification provided
> in-band with the bit stream. Each frame contains both "mode"
> (narrowband, wideband or ultra-wideband) and "sub-mode" (bit-rate)
> information in the bit stream. No out-of-band notification is
> required for the decoder to process changes in the bit rate sent by
> the encoder."
>
> The "sub-mode" can be calculated from the size of coded frame if "mode" is
> known.
> But the "mode" is not contained in the coding output (note it's not the ogg
> case) according to the source code.
> How does the decoder know the "mode" from RTP package? (it's assigned
> directly in the decoder in current svn sources)

The mode/bit-rate is detected automatically by the decoder because it's encoded in the first few bits of each frame. The encoder can change bit-rate at any time without telling the decoder. Can you be more precise as to what isn't clear?

> What's the use of file rtp.txt in /doc directory? and its relationship with
> draft-herlein-speex-rtp-profile-02.txt?

The IETF stuff has been really, really slow. I'm still looking for someone who knows enough to help me finalise all this.

Jean-Marc
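To illustrate what "encoded in the first few bits of each frame" can mean in practice, here is a sketch that reads a frame header with a small MSB-first bit reader. It assumes a narrowband frame starts with a 1-bit wideband flag followed by a 4-bit sub-mode ID; that layout is an assumption made for this example and should be checked against the Speex manual and sources. The demo_ names are hypothetical and are not libspeex API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical MSB-first bit reader; not the libspeex SpeexBits type. */
typedef struct {
    const unsigned char *data;
    size_t nbytes;
    size_t bitpos;              /* next bit, counted from the MSB of data[0] */
} demo_bits;

/* Read nbits (1..16) from the stream, most significant bit first. */
static unsigned demo_unpack(demo_bits *b, int nbits)
{
    unsigned v = 0;
    assert(nbits > 0 && nbits <= 16);
    assert(b->bitpos + (size_t)nbits <= 8 * b->nbytes);
    while (nbits-- > 0) {
        unsigned byte = b->data[b->bitpos >> 3];
        unsigned bit = (byte >> (7 - (b->bitpos & 7))) & 1u;
        v = (v << 1) | bit;
        b->bitpos++;
    }
    return v;
}

typedef struct {
    unsigned wideband_flag;     /* assumed: 1 bit */
    unsigned submode;           /* assumed: 4 bits, selects the bit-rate */
} demo_frame_header;

/* Parse the assumed header layout from the start of one encoded frame. */
demo_frame_header demo_read_header(const unsigned char *frame, size_t nbytes)
{
    demo_bits b;
    demo_frame_header h;
    b.data = frame;
    b.nbytes = nbytes;
    b.bitpos = 0;
    h.wideband_flag = demo_unpack(&b, 1);
    h.submode = demo_unpack(&b, 4);
    return h;
}
```

Under that assumption, a frame whose first byte is 0x18 (binary 0001 1000) parses as flag 0 and sub-mode 3, so the receiver needs no out-of-band notification to follow a bit-rate change.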
> 1) The voice quality of output after coding and decoding in WB mode is
> unacceptable using the SVN fixed-point code,
> while the output of floating-point code is good.

OK, it turns out that your file has rather extreme clipping and this causes an overflow. While overflows are unacceptable (and I'm working on fixing that), I strongly recommend you perform your recordings properly and avoid clipping in the input.

Jean-Marc
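One simple way to check a recording for clipping before feeding it to the encoder is to count the samples that sit exactly at the 16-bit limits. This is only a rough heuristic sketched for illustration; the function name is hypothetical, and what count is too high is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count 16-bit PCM samples pinned at full scale. Many such samples,
 * especially in runs, usually mean the recording clipped. */
size_t demo_count_clipped(const int16_t *pcm, size_t n)
{
    size_t clipped = 0;
    size_t i;
    assert(pcm != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (pcm[i] == INT16_MAX || pcm[i] == INT16_MIN)
            clipped++;
    }
    return clipped;
}
```

A capture tool could run this per frame and warn the user to lower the input gain when the count is more than a handful.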
Hi,

The win32 stuff is my fault. I've just been busy with exams, moving overseas and getting ready to go on holidays. If you notice the project files getting out of date, just give me a prod by email or IRC and I'll try to update them. I added the missing file about 5 minutes ago.

I'm going to be busy or on holiday until the start of January, but after that I should hopefully be able to get back to maintaining the win32 build and working more on oggcodecs.

Zen.

Peter Kirk wrote:
> On Monday 13 November 2006 10:32, Jean-Marc Valin wrote:
>
>> So what's your favourite "I wish Speex could..." or "Speex sucks
>> because..."?
>
> I wish the numerous project files (for Visual C++, Xcode etc.) that are not
> so "common" got tested and updated more frequently. They usually don't work
> out of the box (because files that were added are missing) and it wouldn't be
> much work to fix these problems if somebody were to regularly try these
> project files and add the missing files.
>
> Peter
> The mode/bit-rate is detected automatically by the decoder because it's
> encoded in the first few bits of each frame. The encoder can change
> bit-rate at any time without telling the decoder. Can you be more
> precise as to what isn't clear?

I think that one got me too. If the mode is detected automatically by the decoder, what is the purpose of the 'mode' parameter in the speex_decoder_init function? I'd file it under confusing API. Or am I missing something here (again)?

Gregg
Jean-Marc Valin wrote:
> So what's your favourite "I wish Speex could..." or "Speex sucks
> because..."?

/* FIXME: This VAD is a kludge */

... and it shows (or hears?), unfortunately. I've run a few tests with it with my users, and they complain that it misdetects too often, in both directions. Non-speech is detected as speech more often than before and, more importantly, it also doesn't detect speech as well as before. I'd really like to see this "fixed" in some way before 1.2.

I used to grab quite a few bits of data directly from the SpeexPreprocessState structure. I have an Audio Statistics window in my application which would show SNR info (Zlast), the speech probability and a graphical view of ps and noise. This really helped users to improve the quality of their input, as they could clearly and immediately see the effect of changes in the environment. These data are now private and can't easily be extracted from an outside program. Would it be possible to add _ctl calls for GET_PS, GET_NOISE, GET_SNR etc.? Would you accept patches which did this?
> /* FIXME: This VAD is a kludge */
> .. and it shows (or hears?) unfortunately. I've run a few tests with it
> with my users, and they complain that it misdetects too often... In both
> directions. Non-speech is detected as speech more often than before, and
> more important it also doesn't detect speech as good as before.
> I'd really like to see this "fixed" in some way before 1.2.

OK, it's good to have that information. I'll try to fix that before 1.2.

> I used to grab quite a few bits of data directly from the
> SpeexPreprocessState structure. I have a Audio Statistics window in my
> application which would show SNR info (Zlast), the speech probability
> and a graphical view of ps and noise. This really helped users to
> improve the quality of their input, as they could clearly and
> immediately see the effect of changes in the environment.
> These data are now private and can't easily be extracted from an outside
> program. Would it be possible to add _ctl calls to GET_PS, GET_NOISE,
> GET_SNR etc? Would you accept patches which did this?

One of the main reasons the struct is now private is that its content depends on whether Speex was compiled for fixed-point or floating-point. I'm not against making it possible to extract some information, but it needs to be done in a clean way that doesn't depend on whether you compiled with float or int.

Jean-Marc
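One way to meet that constraint is for the getter to return values in a fixed, documented integer format, whatever type the library uses internally. The sketch below is purely illustrative: the state struct, the DEMO_GET_PROB request, the DEMO_FIXED_POINT switch and the percent scaling are all hypothetical and are not existing Speex API. Only the general (state, request, pointer) shape is borrowed from the existing _ctl calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical build switch: internal values are Q15 shorts or floats. */
#ifdef DEMO_FIXED_POINT
typedef short demo_word16;      /* Q15: 32768 represents 1.0 */
#define DEMO_TO_PERCENT(p) ((int)(((long)(p) * 100 + 16384) >> 15))
#else
typedef float demo_word16;      /* 0.0 .. 1.0 */
#define DEMO_TO_PERCENT(p) ((int)((p) * 100.0f + 0.5f))
#endif

/* Private state; its layout depends on the build, so callers never see it. */
typedef struct {
    demo_word16 speech_prob;
} DemoPreprocessState;

enum { DEMO_GET_PROB = 1 };     /* hypothetical request code */

/* Returns 0 on success, -1 for an unknown request.
 * DEMO_GET_PROB writes an int in percent (0..100) to ptr in every build. */
int demo_preprocess_ctl(DemoPreprocessState *st, int request, void *ptr)
{
    assert(st != NULL && ptr != NULL);
    switch (request) {
    case DEMO_GET_PROB:
        *(int *)ptr = DEMO_TO_PERCENT(st->speech_prob);
        return 0;
    default:
        return -1;
    }
}
```

An application such as the statistics window described above would then read the same integer whether the library was built with float or int, and the struct itself could stay private.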
Hi All,

Another issue is that the memory allocations are spread over so many places that it's hard to provide a single memory initialisation function.

In a VoIP case on ARM, the total memory size for the Speex codec should be known at the initial stage, since all the memory is allocated at that stage.

In my current implementation, all the memory allocations are collected together to form one big structure like the one below:

    typedef struct {
        EncState     enc_state;
        char         stack[NB_ENC_STACK];
        spx_word16_t winBuf[80];
        spx_word16_t excBuf[612];
        spx_word16_t swBuf[612];
        spx_word16_t lagWindow[22];
        ............................................
        int          pitch[16];
        VBRState     vbr;
    } speex_encoder_memory;

The structure defined above is used to allocate all the memory after calling the initialisation function.

The problem here is the size of the structure, which is very large (160k Word32) for wb and is unacceptable for ARM.

Any suggestions?

Regards,
Lianghu
> Another issue is the memory allocations distributed so many places that
> it's hard to provide a single memory initial function interface.
>
> In a VoIP case on ARM, the total memory size for speex codec should be
> known at the initial stage since all the memories are allocated
> at the initial stage.

If you want everything in the same big block, all you need to do is override the speex_alloc() function to give pointers to a big block you've previously allocated.

> In my current implementation, all the memory allocations are collected
> together to form one big structure like below,
>
> typedef struct {
>     EncState enc_state;
>     char stack[NB_ENC_STACK];
>     spx_word16_t winBuf[80];
>     spx_word16_t excBuf[612];
>     spx_word16_t swBuf[612];
>     spx_word16_t lagWindow[22];

Yuk. That's so.... G.729 and ITU codecs :-) As much as you'd like something like that, it would be a pain to maintain. If you want everything in one chunk, just go with the solution above.

> The structure defined above is used to allocate all memories after call
> the initial function.
>
> The problem here is the size of the structure which is very large(160k
> Word32) for wb which is unacceptable for ARM.

That's a totally different topic. I do intend to reduce the wb memory usage, just like I did with the narrowband for 1.2beta1. Still, I don't know where you got this 160k Word32 number (640 kB). I don't think wideband requires anywhere near that amount of memory.

Jean-Marc
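The "one big block" idea from the reply above can be sketched as a bump allocator: a static arena hands out consecutive, aligned, zeroed slices. The demo_ names and the arena size are hypothetical. How such a function is wired in as the replacement for speex_alloc() depends on the source tree and is not shown, and the zeroing assumes the library expects calloc-like behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ARENA_BYTES ((size_t)32768)   /* placeholder size, tune per mode */

/* The union members force the byte array to start on a strict boundary. */
typedef union {
    long double   ld;
    long long     ll;
    void         *p;
    unsigned char bytes[DEMO_ARENA_BYTES];
} demo_arena_t;

static demo_arena_t demo_arena;
static size_t demo_arena_used;             /* high-water mark in bytes */

/* Hand out the next slice of the arena, rounded up to 16 bytes.
 * Returns NULL when the arena is exhausted. Nothing is ever freed
 * individually, which matches an allocate-everything-at-init model. */
void *demo_arena_alloc(int size)
{
    size_t rounded;
    void *ptr;
    assert(size >= 0);
    rounded = ((size_t)size + 15u) & ~(size_t)15u;
    if (rounded > DEMO_ARENA_BYTES - demo_arena_used)
        return NULL;
    ptr = demo_arena.bytes + demo_arena_used;
    demo_arena_used += rounded;
    memset(ptr, 0, rounded);               /* calloc-like: zeroed memory */
    return ptr;
}

/* Total bytes handed out so far. */
size_t demo_arena_bytes_used(void)
{
    return demo_arena_used;
}
```

Reading demo_arena_bytes_used() after initialising an encoder and a decoder would also give the total memory figure needed up front, without maintaining a hand-written structure of every buffer.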