thr3ads.net - opus - [opus] Optimizing on AMD Geode (MMX, no SSE) [Jan 2015]


Matteo Fortini

2015-Jan-07 16:01 UTC

[opus] Optimizing on AMD Geode (MMX, no SSE)

I'm trying to improve Opus performance on an AMD Geode CPU, which has no 
SSE support (it offers 3DNow! instead) but does have MMX.

Without optimizations I can only encode 16-bit audio at 16 kHz with 
complexity up to 2-3 without underruns.

I tried compiling with the SSE2/4 optimizations, but all I got was a crash 
with SIGILL. So I looked into the optimized code and found that a good 
starting point was the dot product. I inserted an MMX implementation 
of it, which gained a bit of performance.
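
For readers following along, here is a minimal sketch of what an MMX dot 
product over 16-bit samples might look like. It is not the code from this 
post and not an Opus function: the name dot_prod_s16 is hypothetical, and 
the sketch assumes 32-bit accumulation is enough for the input range. It 
uses the standard <mmintrin.h> intrinsics when the compiler defines 
__MMX__ and falls back to plain C otherwise.

```c
#include <stdint.h>
#include <string.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical helper (not part of Opus): dot product of two int16
 * arrays with 32-bit accumulation. Overflow of the 32-bit sum is the
 * caller's problem, as in any fixed-point kernel. */
static int32_t dot_prod_s16(const int16_t *x, const int16_t *y, int n)
{
    int32_t sum = 0;
    int i = 0;
#if defined(__MMX__)
    __m64 acc = _mm_setzero_si64();
    for (; i + 4 <= n; i += 4) {
        __m64 vx, vy;
        /* memcpy avoids alignment and aliasing assumptions. */
        memcpy(&vx, x + i, sizeof vx);
        memcpy(&vy, y + i, sizeof vy);
        /* pmaddwd: four 16x16 products, pairwise added into two
         * 32-bit lanes, then accumulated. */
        acc = _mm_add_pi32(acc, _mm_madd_pi16(vx, vy));
    }
    /* Horizontal add of the two 32-bit lanes. */
    sum = _mm_cvtsi64_si32(acc)
        + _mm_cvtsi64_si32(_mm_srli_si64(acc, 32));
    /* Clear MMX state so later x87 floating-point code works. */
    _mm_empty();
#endif
    /* Scalar tail (or the whole array when MMX is unavailable). */
    for (; i < n; i++)
        sum += (int32_t)x[i] * y[i];
    return sum;
}
```

The _mm_empty() call matters on a CPU like the Geode, where MMX registers 
alias the x87 stack. Its cost is one reason very short loops may not 
benefit from MMX.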

Then I saw the xcorr function in its simplest form, which is looping and 
calculating dot products, and substituted the dot product with a call to 
the MMX version. This way I can go up to complexity 3-4 without underruns.
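
The "loop of dot products" form of cross-correlation described above can 
be sketched as follows. The names xcorr_s16 and dot_prod_s16 are 
hypothetical and the signature is simplified compared to the real Opus 
function; the scalar dot product here is only a stand-in for whatever 
optimized version is plugged in.

```c
#include <stdint.h>

/* Stand-in for an optimized dot product (hypothetical name). */
static int32_t dot_prod_s16(const int16_t *x, const int16_t *y, int len)
{
    int32_t sum = 0;
    for (int j = 0; j < len; j++)
        sum += (int32_t)x[j] * y[j];
    return sum;
}

/* Simplest cross-correlation: one dot product per lag.
 * xcorr[lag] = sum over j of x[j] * y[j + lag], for 0 <= lag < max_lag.
 * The caller must provide at least len + max_lag - 1 samples in y. */
static void xcorr_s16(const int16_t *x, const int16_t *y,
                      int32_t *xcorr, int len, int max_lag)
{
    for (int lag = 0; lag < max_lag; lag++)
        xcorr[lag] = dot_prod_s16(x, y + lag, len);
}
```

Each lag re-reads all of x, so a kernel that computes several lags per 
pass over x can reduce the number of loads. That is a possible next step, 
not something measured here.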

Since this is far from optimal, I was looking into other places that 
would get big benefits from parallelization.

Can you point out some? I was thinking about the FIR/IIR filter 
implementations, but I'm afraid the overhead of using MMX would offset 
the gain, since the filter is probably not so long.

Of course I can share the MMX code, even if it's still not cleanly 
incorporated in the source.

Thank you in advance,
Matteo

Timothy B. Terriberry

2015-Jan-07 16:25 UTC


Matteo Fortini wrote:
> Since this is far from optimal, I was looking into other places that
> would get big benefits from parallelization.

The best answer, of course, is to profile the code. I'm assuming you're 
primarily targeting VoIP (and thus SILK at those settings). The 
fixed-point x86 optimizations are reasonably good choices, but were 
targeted at complexity 2, so their relative importance will change a bit 
as you move to higher complexities. For example, once you reach complexity 4, 
warped autocorrelation starts to get used, as well as NSQ_del_dec instead of 
plain NSQ.
