Displaying 14 results from an estimated 14 matches for "silk_lpc_analysis_filter".
2017 Feb 15
4
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
Hi,
Attached are two patches. Patch 1 refactors silk_LPC_analysis_filter(). And
Patch 2 optimizes the new function celt_fir_permit_overflow() for ARM NEON.
Please recommend a better function name.
We did the same internal code review and testing already.
Thanks,
Linfeng
2017 Feb 15
2
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
...mit_overflow() which is the
SMALL_FOOTPRINT branch.
The only difference for fixed-point is:
celt_fir(): the sum is truncated first, then added to x[i] and
saturated.
celt_fir_permit_overflow(): x[i] is added to the sum first, and the result is
then truncated and saturated.
Maybe this is the reason why silk_LPC_analysis_filter() switched the FIR
from celt_fir() to celt_fir_permit_overflow() half a year ago.
Because of silk_LPC_analysis_filter(), celt_fir_permit_overflow() must
behave the same for both floating-point and fixed-point, and this is why we
defined ADD32_FIXED(), ..., PSHR32_FIXED() etc.
It's still a mes...
2017 Feb 18
0
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
...se in the
SMALL_FOOTPRINT case it first gets shifted up by SIG_SHIFT, the result
of the downshift (also by SIG_SHIFT) is the same no matter when it gets
added. That being said, I thought adding at the beginning was nicer so I
changed the remaining code to do that.
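The equivalence described here can be checked with a minimal sketch, assuming the truncated sentence refers to the input sample x[i] being shifted up by SIG_SHIFT before the accumulated sum is shifted back down. The helper names and the value 12 for SIG_SHIFT are assumptions made for illustration, not the library's code. The sketch also assumes no 32-bit overflow and that `>>` on a negative value is an arithmetic shift, which C leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

#define SIG_SHIFT 12  /* assumed value, for illustration only */

/* Rounding right shift: add half an LSB, then shift down. */
static int32_t pshr32(int32_t a, int shift)
{
    assert(shift > 0 && shift < 31);
    return (a + ((int32_t)1 << (shift - 1))) >> shift;
}

/* Clamp a 32-bit value to the 16-bit range. */
static int16_t saturate16(int32_t a)
{
    if (a > 32767) return 32767;
    if (a < -32768) return -32768;
    return (int16_t)a;
}

/* Order A: shift the filter sum down first, then add x and saturate. */
static int16_t add_after_shift(int16_t x, int32_t sum)
{
    return saturate16((int32_t)x + pshr32(sum, SIG_SHIFT));
}

/* Order B: add x (shifted up) to the sum first, then shift down and saturate. */
static int16_t add_before_shift(int16_t x, int32_t sum)
{
    int32_t acc = (int32_t)x * (1 << SIG_SHIFT) + sum;
    return saturate16(pshr32(acc, SIG_SHIFT));
}
```

Because x shifted up by SIG_SHIFT is an exact multiple of the rounding step, it passes through the downshift unchanged, so both orders give the same sample as long as nothing overflows.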
> Maybe this is the reason why silk_LPC_analysis_filter() switched the FIR
> from celt_fir() to celt_fir_permit_overflow() half a year ago.
No, it's another issue. silk_LPC_analysis_filter() was always bit-exact
with celt_fir(), with or without SMALL_FOOTPRINT. The only difference
lies with the signed integer overflow suppression.
silk_LPC_analy...
2016 Jul 28
0
[PATCH] Optimize silk_LPC_analysis_filter() for ARM NEON
...ysis_filter.c
+++ b/silk/LPC_analysis_filter.c
@@ -44,9 +44,8 @@ POSSIBILITY OF SUCH DAMAGE.
current implementation silences by casting to unsigned. Enabling
this should be safe in pretty much all cases, even though it is not technically
C89-compliant. */
-#define USE_CELT_FIR 0
-void silk_LPC_analysis_filter(
+void silk_LPC_analysis_filter_c(
opus_int16 *out, /* O Output signal */
const opus_int16 *in, /* I Input signal */
const opus_...
2017 Feb 15
0
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
...Can you give me a bit more details about the purpose of this patchset.
It seems to me like it's mostly duplicating the celt_fir()
optimizations? Did I miss anything?
Cheers,
Jean-Marc
On 15/02/17 02:22 PM, Linfeng Zhang wrote:
> Hi,
>
> Attached are two patches. Patch 1 refactors silk_LPC_analysis_filter().
> And Patch 2 optimizes the new function celt_fir_permit_overflow() for
> ARM NEON.
>
> Please recommend a better function name.
>
> We did the same internal code review and testing already.
>
> Thanks,
> Linfeng
2017 Feb 15
0
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
Hi Linfeng,
On 15/02/17 02:22 PM, Linfeng Zhang wrote:
> Attached are two patches. Patch 1 refactors silk_LPC_analysis_filter().
> And Patch 2 optimizes the new function celt_fir_permit_overflow() for
> ARM NEON.
>
> Please recommend a better function name.
In most other cases, we've just added the _ovflw() suffix to
functions/macros where signed overflow is allowed (suppressed using
unsigned cast).
>...
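As a rough illustration of what "suppressed using unsigned cast" means, the sketch below performs the addition on unsigned values, where wraparound is well defined, and converts the result back to signed. The names add32_ovflw() and mac_ovflw() are hypothetical and chosen only to echo the suffix mentioned above. Converting an out-of-range unsigned value back to a signed type is implementation-defined in C, so this relies on the usual two's-complement behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper in the spirit of an "_ovflw" macro: add as unsigned
   (wraparound is defined), then convert back to signed. */
static int32_t add32_ovflw(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Multiply-accumulate that tolerates intermediate wraparound. */
static int32_t mac_ovflw(const int16_t *a, const int16_t *b, int n)
{
    int32_t acc = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        acc = add32_ovflw(acc, (int32_t)a[i] * b[i]);
    return acc;
}
```

With this pattern an intermediate sum may wrap and still produce the expected final value if later terms bring it back into range, without triggering undefined behaviour from signed overflow.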
2017 Mar 01
2
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
...n when USE_CELT_FIR=0.
xcorr_kernel() itself is great and provides many gains. The only issue is
that calling it in a for loop makes it less efficient.
xcorr_kernel() is called in several functions, including
celt_fir(), celt_pitch_xcorr() and celt_iir(). None of these functions is a
heavy hitter.
silk_LPC_analysis_filter() accounts for 6.8% of the whole encoder's CPU cycles
at complexity 8 and 8.9% at complexity 5. It probably makes sense to have
a specific optimization that avoids calling xcorr_kernel() too many times, to
save 1% to 1.5% of CPU cycles here.
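To make the call pattern under discussion concrete, here is a minimal scalar sketch, not the Opus code. xcorr_kernel4(), fir_via_kernel(), fir_direct(), fir_variants_agree() and MAX_ORD are hypothetical names, and the kernel is only assumed to produce four outputs per call. One variant calls the kernel once per block of four output samples; the other writes the taps loop inline. The sketch shows only that the two compute the same sums and makes no claim about their relative speed.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ORD 32  /* arbitrary bound for this sketch */

/* Hypothetical stand-in for a 4-output correlation kernel:
   sum[k] += x[0]*y[k] + x[1]*y[k+1] + ... for k = 0..3. */
static void xcorr_kernel4(const int16_t *x, const int16_t *y,
                          int32_t sum[4], int len)
{
    for (int j = 0; j < len; j++)
        for (int k = 0; k < 4; k++)
            sum[k] += (int32_t)x[j] * y[j + k];
}

/* FIR sum built on the kernel: one kernel call per 4 output samples.
   `in` must have `ord` samples of history before in[0]. */
static void fir_via_kernel(const int16_t *in, const int16_t *num,
                           int32_t *out, int n, int ord)
{
    int16_t rnum[MAX_ORD];
    assert(ord > 0 && ord <= MAX_ORD && n % 4 == 0);
    for (int j = 0; j < ord; j++)      /* reverse the taps once */
        rnum[j] = num[ord - 1 - j];
    for (int i = 0; i < n; i += 4) {
        int32_t sum[4] = {0, 0, 0, 0};
        xcorr_kernel4(rnum, in + i - ord, sum, ord);
        for (int k = 0; k < 4; k++)
            out[i + k] = sum[k];
    }
}

/* Same FIR sum with the taps loop written inline, no per-block call. */
static void fir_direct(const int16_t *in, const int16_t *num,
                       int32_t *out, int n, int ord)
{
    for (int i = 0; i < n; i++) {
        int32_t acc = 0;
        for (int j = 0; j < ord; j++)
            acc += (int32_t)num[j] * in[i - 1 - j];
        out[i] = acc;
    }
}

/* Returns 1 when both variants agree on a small made-up signal. */
static int fir_variants_agree(void)
{
    int16_t buf[4 + 8];                /* 4 history samples + 8 inputs */
    const int16_t num[4] = {3, -2, 5, 1};
    int32_t a[8], b[8];
    for (int i = 0; i < 12; i++)
        buf[i] = (int16_t)(i * 7 - 20);
    fir_via_kernel(buf + 4, num, a, 8, 4);
    fir_direct(buf + 4, num, b, 8, 4);
    for (int i = 0; i < 8; i++)
        if (a[i] != b[i]) return 0;
    return 1;
}
```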
What do you think?
Thanks,
Linfeng
> We can then switch b...
2017 Mar 01
0
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
Linfeng Zhang wrote:
> xcorr_kernel() itself is great and provides many gains. The only issue
> is that calling it in a for loop makes it less efficient.
Do you think it would be possible to improve the API of xcorr_kernel()
so that calling it in a loop is more efficient?
I haven't looked at an instruction-level profile, but I find it hard to
believe that the function
2017 Mar 02
0
Antw: Re: [PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
Hi!
I'm not deep in the code, but in my experience even an older gcc (4.3.4) does function inlining at -O2, and at -O3 it inlines almost any function inside one module. Once I even let it inline across modules (-combine). I'm not talking about explicit inline functions, just about automatic optimization.
So did you check that frequent function calls actually happen? I'm a bit afraid
2017 Mar 01
3
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
Hi Timothy,
> Do you think it would be possible to improve the API of xcorr_kernel() so
> that calling it in a loop is more efficient?
>
If it could be inlined, it would be more efficient. Besides memory bouncing,
frequent function calls are expensive.
> The other advantage to wiring up xcorr_kernel() is that it applies in more
> places than your intrinsics-only celt_fir() implementation.
2016 Jun 17
5
ARM NEON optimization -- celt_fir()
Hi all,
This is Linfeng Zhang from Google. I'll work on ARM NEON optimization in the
next few months.
I'm submitting 2 patches in the following couple of emails, which contain the
newly created celt_fir_neon().
I revised celt_fir_c() to not pass in the argument "mem" in Patch 1. If there
are concerns about this change, please let me know.
Many thanks for your comments.
Linfeng Zhang
2016 Jul 14
6
Several patches of ARM NEON optimization
I rebased my previous 3 patches onto the current master with minor changes.
Patches 1 to 3 replace all my previously submitted patches.
Patches 4 and 5 are new.
Thanks,
Linfeng Zhang
2016 Aug 23
0
[PATCH 8/8] Optimize silk_NSQ_del_dec() for ARM NEON
...subfr = 0;
+ }
+
+ /* Rewhiten with new A coefs */
+ start_idx = psEncC->ltp_mem_length - lag - psEncC->predictLPCOrder - LTP_ORDER / 2;
+ silk_assert( start_idx > 0 );
+
+ silk_LPC_analysis_filter( &sLTP[ start_idx ], &NSQ->xq[ start_idx + k * psEncC->subfr_length ],
+ A_Q12, psEncC->ltp_mem_length - start_idx, psEncC->predictLPCOrder, psEncC->arch );
+
+ NSQ->sLTP_buf_idx = psEncC->ltp_mem_length;
+ NS...
2016 Aug 23
2
[PATCH 7/8] Update NSQ_LPC_BUF_LENGTH macro.
NSQ_LPC_BUF_LENGTH is independent of DECISION_DELAY.
---
silk/define.h | 4 ----
1 file changed, 4 deletions(-)
diff --git a/silk/define.h b/silk/define.h
index 781cfdc..1286048 100644
--- a/silk/define.h
+++ b/silk/define.h
@@ -173,11 +173,7 @@ extern "C"
#define MAX_MATRIX_SIZE MAX_LPC_ORDER /* Max of LPC Order and LTP order */
-#if( MAX_LPC_ORDER >