The reasons I have posted these questions are:
1) To find out if Speex can take advantage of SIMD extensions.
2) To maybe learn from someone with previous experience in optimizing Speex
for modern x86 architectures before I set off trying all kinds of things on
my own.

See answers inline:

2009/6/15 Tom Grandgent <tom at grandgent.com>

> Why haven't you tried using release build with compiler optimizations?

I just haven't started with optimizing... yet.

> It's quite possible that the performance picture could be substantially
> different. You might end up wasting a lot of time if you do much
> performance analysis or optimization on a debug build.

Yes, you are right and that is not what I'm doing.

> Debug build
> not only has no optimization - it also has extra checks that may have a
> significant performance impact depending on the code.

Possibly, but I have identified the most expensive functions. They are all
from the Speex dll. I believe these will remain the most interesting ones
also in a release build with some O-flags. Someone who is knowledgeable in
these functions might know if their impact can be reduced and what the best
practices are.

> If you want to profile with symbols, you know you can compile a release
> build with symbols, right? The CodeAnalyst documentation describes
> how to do that with Visual Studio. (I've done it.)

Ok. thx.

> Tom
>
> Greger Burman <greger at mobile-robotics.com> wrote:
> >
> > I have a question about the overall performance of Speex and what I can do
> > to improve it. I'm running Speex on Windows x86, Visual C++ EE compiler. I will
> > say right away that I've only compiled debug so far and used no compiler
> > optimizations at all.
> > I use the uwb-mode, preprocessing, denoising and echo cancellation.
> > I've noticed that speex consumes a lot of cpu resources. When I run this on
> > a Celeron 2,6GHz I have to disable EC in order to not overload the cpu. Am I
> > correct to assume that there are massive floating point calculations
> > happening?
> > I did a quick profile with CodeAnalyst and identified the most expensive
> > functions as (in order):
> > CPU Clocks, Function
> > 4657, kiss_fft_stride
> > 4456, speex_echo_cancellation
> > 2494, split_cb_search_shape_sign
> > 1490, fir_mem16
> > 1419, speex_preprocess_run
> > I'm looking for advice on how to boost the performance with as little code
> > rewrite as possible. The architecture for release build will be SSE/SSE2
> > capable.
> > 1) Compiler optimizations: Recommended options?
> > 2) SIMD. Is Speex written to take advantage of SIMD architectures? What must
> > I do to take advantage of this?
> > --
> > Greger Burman

Quoting Greger Burman <greger at mobile-robotics.com>:

> The reasons I have posted these questions are: 1) To find out if Speex can
> take advantage of SIMD extensions.

If you define _USE_SSE, Speex is already able to use SSE instructions. You
must be careful on Windows though, because the Visual C++ compiler doesn't
support C99 variable-length arrays and memory from alloca() isn't properly
aligned for SSE, so you have to make sure that alloca() isn't used.

> 2) To maybe learn from someone with previous experience in optimizing Speex
> for modern x86 architectures before I set off trying all kinds of things on
> my own.

Can't think of anything else you need to know.

Jean-Marc

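The alignment point above can be illustrated with a small C sketch. It is not taken from the Speex sources; the names align16, inner_prod_aligned and demo_inner_prod are hypothetical and exist only for this example. It assumes that aligned SSE loads such as _mm_load_ps need 16-byte-aligned addresses, and it gets such addresses by over-allocating a heap buffer and rounding the pointer up, instead of relying on whatever alignment alloca() happens to return. On targets without SSE the sketch falls back to a plain scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAVE_SSE 1
#else
#define SKETCH_HAVE_SSE 0
#endif

/* Hypothetical helper: round a pointer up to the next 16-byte boundary.
   Assumes a pointer fits in size_t, which holds on x86 Windows and Linux. */
static float *align16(void *p)
{
    size_t a = ((size_t)p + 15u) & ~(size_t)15u;
    return (float *)a;
}

/* Hypothetical helper: inner product of two 16-byte-aligned float arrays.
   len must be a multiple of 4. */
static float inner_prod_aligned(const float *a, const float *b, int len)
{
    int i;
    /* Aligned loads fault on misaligned addresses, so check the precondition. */
    assert(((size_t)a & 15u) == 0 && ((size_t)b & 15u) == 0);
#if SKETCH_HAVE_SSE
    {
        __m128 sum = _mm_setzero_ps();
        float lanes[4];
        for (i = 0; i < len; i += 4)
            sum = _mm_add_ps(sum, _mm_mul_ps(_mm_load_ps(a + i), _mm_load_ps(b + i)));
        _mm_storeu_ps(lanes, sum);   /* unaligned store: lanes[] has no alignment guarantee */
        return lanes[0] + lanes[1] + lanes[2] + lanes[3];
    }
#else
    {
        float sum = 0.0f;
        for (i = 0; i < len; i++)
            sum += a[i] * b[i];
        return sum;
    }
#endif
}

/* Carve two aligned 8-float arrays out of one over-allocated heap block. */
static float demo_inner_prod(void)
{
    void *raw = malloc(2 * 8 * sizeof(float) + 15);  /* +15 leaves room to round up */
    float *x, *y, result;
    int i;
    if (raw == NULL)
        return -1.0f;
    x = align16(raw);
    y = x + 8;                       /* 32 bytes further on, so still 16-byte aligned */
    for (i = 0; i < 8; i++) {
        x[i] = (float)(i + 1);
        y[i] = 1.0f;
    }
    result = inner_prod_aligned(x, y, 8);
    free(raw);                       /* free the original pointer, not the aligned one */
    return result;
}
```

With Visual C++ on 32-bit x86, the SSE path in this sketch is only selected when the compiler is invoked with /arch:SSE or /arch:SSE2; otherwise the scalar fallback is compiled.
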
Ok. Will look out for alloca().

cheers

> If you define _USE_SSE, Speex is already able to use SSE instructions. You must
> be careful on Windows though because it doesn't support C99 var-arrays and
> alloca() isn't properly aligned for SSE, so you have to make sure that alloca()
> isn't used.