Displaying 18 results from an estimated 18 matches for "celt_iir".
2013 Jun 07
2
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
Hi JM,
At line 221 in celt_lpc.c (the celt_iir function) I think you really
want the RESTORE_STACK statement to be before the #endif instead of
after it. Also, I couldn't help noticing that your SSE code for
xcorr_kernel reads more than "len" elements of "_x". I don't know if
that's really a problem when runnin...
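The over-read concern raised in this message can be illustrated with a minimal sketch. It assumes a float build and a kernel that accumulates four lags at once. The name xcorr_kernel_sketch is hypothetical, and this is plain C rather than the SSE code under discussion. The main loop consumes x in blocks of four, as a SIMD version might, and a scalar tail handles the remainder, so exactly len elements of x and len + 3 elements of y are read.

```c
#include <assert.h>

/* Hypothetical float sketch of a 4-output cross-correlation kernel.
   For k = 0..3 it accumulates sum[k] += x[j] * y[j + k], j = 0..len-1.
   It reads exactly len elements of x and len + 3 elements of y. */
void xcorr_kernel_sketch(const float *x, const float *y,
                         float sum[4], int len)
{
    int j = 0;
    assert(len > 0);
    /* Main loop: four x samples per iteration, the way a SIMD
       version might load them in one vector. */
    for (; j + 4 <= len; j += 4) {
        int t, k;
        for (t = 0; t < 4; t++)
            for (k = 0; k < 4; k++)
                sum[k] += x[j + t] * y[j + t + k];
    }
    /* Scalar tail: handles the len % 4 leftover samples without
       touching x[len] or anything beyond it. */
    for (; j < len; j++) {
        int k;
        for (k = 0; k < 4; k++)
            sum[k] += x[j] * y[j + k];
    }
}
```

Whether reading past the end of x is harmful in practice depends on how the caller allocates its buffers; the tail loop simply removes the question.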
2013 Jun 07
2
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
.... Your SSE version seems to
> also be slightly faster than mine -- probably due to the partial sums.
> As for the NEON code, it would be good to compare the performance with
> the code Aurélien Zanelli posted at
> http://darkosphere.fr/public/0002-Add-optimized-NEON-version-of-celt_fir-celt_iir-and-.patch
>
> Cheers,
>
> Jean-Marc
>
>
2013 Jun 07
0
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
...ey're in git now. Your SSE version seems to
also be slightly faster than mine -- probably due to the partial sums.
As for the NEON code, it would be good to compare the performance with
the code Aurélien Zanelli posted at
http://darkosphere.fr/public/0002-Add-optimized-NEON-version-of-celt_fir-celt_iir-and-.patch
Cheers,
Jean-Marc
On 06/06/2013 08:07 PM, John Ridges wrote:
> Hi JM,
>
> At line 221 in celt_lpc.c (the celt_iir function) I think you really
> want the RESTORE_STACK statement to be before the #endif instead of
> after it. Also, I couldn't help noticing that you...
2013 Jun 07
0
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
...on seems to
>> also be slightly faster than mine -- probably due to the partial sums.
>> As for the NEON code, it would be good to compare the performance with
>> the code Aurélien Zanelli posted at
>> http://darkosphere.fr/public/0002-Add-optimized-NEON-version-of-celt_fir-celt_iir-and-.patch
>>
>>
>> Cheers,
>>
>> Jean-Marc
>>
>>
>
>
>
2016 Jun 17
0
ARM NEON optimization -- celt_fir()
...otten around to reviewing). As they used Neon intrinsics, several of these actually applied to both armv7 and aarch64 Neon.
In particular, note http://lists.xiph.org/pipermail/opus/2015-December/003339.html , which added a Neon-optimized version of xcorr_kernel. xcorr_kernel is used in celt_fir, celt_iir, and celt_pitch_xcorr.
> On Jun 17, 2016, at 5:09 PM, Linfeng Zhang <linfengz at google.com> wrote:
>
> Hi all,
>
> This is Linfeng Zhang from Google. I'll work on ARM NEON optimization in the
> next few months.
>
> I'm submitting 2 patches in the following...
2015 Nov 21
0
[Aarch64 v2 08/18] Add Neon fixed-point implementation of xcorr_kernel.
Used for celt_pitch_xcorr on aarch64, and celt_fir and celt_iir on both armv7 and aarch64.
---
celt/arm/arm_celt_map.c | 17 +++++++++++++
celt/arm/celt_neon_intr.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++-
celt/arm/pitch_arm.h | 31 +++++++++++++++++++++++-
3 files changed, 107 insertions(+), 2 deletions(-)
diff --git a/celt/arm/arm_celt_m...
2017 Mar 01
2
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
...named silk_fir() with optimization to do the
calculation when USE_CELT_FIR=0.
xcorr_kernel() itself is great and provides many gains. The only issue is
that calling it in a for loop makes it less efficient.
xcorr_kernel() is called in several functions including
celt_fir(), celt_pitch_xcorr() and celt_iir(). None of these functions are
heavy hitters.
silk_LPC_analysis_filter() takes 6.8% of the whole encoder's CPU cycles at
complexity 8 and 8.9% at complexity 5. It probably makes sense to have
a specific optimization that avoids calling xcorr_kernel() too many times, to
save 1% to 1.5% CPU cycle...
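The "kernel called in a for loop" pattern mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the Opus source: it uses floats, the names kernel4, fir_sketch and MAX_ORD are hypothetical, x must be preceded by ord samples of valid history, and n must be a multiple of 4. Each loop iteration makes one kernel call that produces four filter outputs.

```c
#include <assert.h>

/* Compact 4-output kernel: sum[k] += x[j] * y[j + k] for k = 0..3. */
void kernel4(const float *x, const float *y, float sum[4], int len)
{
    int j, k;
    for (j = 0; j < len; j++)
        for (k = 0; k < 4; k++)
            sum[k] += x[j] * y[j + k];
}

#define MAX_ORD 32 /* hypothetical upper bound on the filter order */

/* Computes y[i] = x[i] + sum over j < ord of num[j] * x[i - 1 - j].
   x must point just past `ord` samples of valid history. */
void fir_sketch(const float *x, const float *num, float *y, int n, int ord)
{
    float rnum[MAX_ORD];
    int i, k;
    assert(ord > 0 && ord <= MAX_ORD && n % 4 == 0);
    /* Reverse the taps so the kernel's forward walk over x
       matches the backward walk of the filter. */
    for (i = 0; i < ord; i++)
        rnum[i] = num[ord - 1 - i];
    /* One kernel call per block of four output samples. */
    for (i = 0; i < n; i += 4) {
        float sum[4];
        for (k = 0; k < 4; k++)
            sum[k] = x[i + k];
        kernel4(rnum, x + i - ord, sum, ord);
        for (k = 0; k < 4; k++)
            y[i + k] = sum[k];
    }
}
```

The per-call overhead (argument setup, loading the reversed taps again) is paid once for every four outputs, which is the inefficiency the message points at when the filter order is small.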
2013 Jun 07
1
Bug fix in celt_lpc.c and some xcorr_kernel optimizations
...>> also be slightly faster than mine -- probably due to the partial sums.
>>> As for the NEON code, it would be good to compare the performance with
>>> the code Aurélien Zanelli posted at
>>> http://darkosphere.fr/public/0002-Add-optimized-NEON-version-of-celt_fir-celt_iir-and-.patch
>>>
>>>
>>> Cheers,
>>>
>>> Jean-Marc
>>>
>>>
>>
>>
>
2013 May 21
0
[PATCH] 02-
...for (j=0;j<ord;j++)
{
- sum += MULT16_16(num[j],mem[j]);
+ sum = MAC16_16(sum, num[j], mem[j]);
}
for (j=ord-1;j>=1;j--)
{
@@ -111,6 +116,7 @@ void celt_fir(const opus_val16 *x,
y[i] = ROUND16(sum, SIG_SHIFT);
}
}
+#endif
void celt_iir(const opus_val32 *x,
const opus_val16 *den,
@@ -136,6 +142,7 @@ void celt_iir(const opus_val32 *x,
}
}
+#ifndef OVERRIDE_CELT_AUTOCORR
void _celt_autocorr(
const opus_val16 *x, /* in: [0...n-1] samples x */
opus_val32 *ac, /* out...
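The change from "sum += MULT16_16(...)" to "sum = MAC16_16(sum, ...)" in the diff above does not alter the result. A minimal sketch of the two forms, assuming generic fixed-point definitions: the typedefs val16 and val32 and the helpers dot_mult and dot_mac are stand-ins invented for this example, and the macro bodies are a plain-C approximation rather than a copy of the Opus headers. Presumably the multiply-accumulate form is preferred because a platform header can map it to a single instruction.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t val16; /* stand-in for opus_val16 in a fixed-point build */
typedef int32_t val32; /* stand-in for opus_val32 */

/* 16x16 -> 32-bit multiply. */
#define MULT16_16(a, b) (((val32)(val16)(a)) * ((val32)(val16)(b)))
/* Multiply-accumulate: c + a * b. */
#define MAC16_16(c, a, b) ((c) + MULT16_16((a), (b)))

/* Old form: separate multiply, then add. */
val32 dot_mult(const val16 *num, const val16 *mem, int ord)
{
    val32 sum = 0;
    int j;
    assert(ord >= 0);
    for (j = 0; j < ord; j++)
        sum += MULT16_16(num[j], mem[j]);
    return sum;
}

/* New form: one multiply-accumulate per tap. */
val32 dot_mac(const val16 *num, const val16 *mem, int ord)
{
    val32 sum = 0;
    int j;
    assert(ord >= 0);
    for (j = 0; j < ord; j++)
        sum = MAC16_16(sum, num[j], mem[j]);
    return sum;
}
```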
2016 Jun 17
5
ARM NEON optimization -- celt_fir()
Hi all,
This is Linfeng Zhang from Google. I'll work on ARM NEON optimization in the
next few months.
I'm submitting 2 patches in the following couple of emails, which have the
newly created celt_fir_neon().
I revised celt_fir_c() to not pass in argument "mem" in Patch 1. If there are
concerns about this change, please let me know.
Many thanks for your comments.
Linfeng Zhang
2013 May 21
2
[PATCH] 02-Add CELT filter optimizations
...for (j=0;j<ord;j++)
{
- sum += MULT16_16(num[j],mem[j]);
+ sum = MAC16_16(sum, num[j], mem[j]);
}
for (j=ord-1;j>=1;j--)
{
@@ -111,6 +116,7 @@ void celt_fir(const opus_val16 *x,
y[i] = ROUND16(sum, SIG_SHIFT);
}
}
+#endif
void celt_iir(const opus_val32 *x,
const opus_val16 *den,
@@ -136,6 +142,7 @@ void celt_iir(const opus_val32 *x,
}
}
+#ifndef OVERRIDE_CELT_AUTOCORR
void _celt_autocorr(
const opus_val16 *x, /* in: [0...n-1] samples x */
opus_val32 *ac, /* out...
2013 Jun 10
0
opus Digest, Vol 53, Issue 2
...>> also be slightly faster than mine -- probably due to the partial sums.
>>> As for the NEON code, it would be good to compare the performance with
>>> the code Aurélien Zanelli posted at
>>> http://darkosphere.fr/public/0002-Add-optimized-NEON-version-of-celt_fir-celt_iir-and-.patch
>>>
>>>
>>> Cheers,
>>>
>>> Jean-Marc
>>>
>>>
>>
>>
>
------------------------------
Message: 2
Date: Sat, 8 Jun 2013 02:54:03 +0000 (UTC)
From: casey guan <guanxiansun at gmail.com>
Subject: [o...
2017 Feb 15
2
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
Hi Jean-Marc,
The original celt_fir() is a little bit messy. It has 2 branches chosen by
#ifdef SMALL_FOOTPRINT.
For floating-point, the 2 branches are identical (except for the order in
which x[i] is accumulated into sum, which is not a big deal).
For fixed-point, the 2 branches are different. I separated them into 2
functions: the new celt_fir(), and celt_fir_permit_overflow() which is the
2013 May 23
2
ASM runtime detection and optimizations
...}
-#ifndef OVERRIDE_CELT_FIR
-void celt_fir(const opus_val16 *x,
+void celt_fir_c(const opus_val16 *x,
const opus_val16 *num,
opus_val16 *y,
int N,
@@ -116,7 +127,6 @@ void celt_fir(const opus_val16 *x,
y[i] = ROUND16(sum, SIG_SHIFT);
}
}
-#endif
void celt_iir(const opus_val32 *x,
const opus_val16 *den,
@@ -142,7 +152,6 @@ void celt_iir(const opus_val32 *x,
}
}
-#ifndef OVERRIDE_CELT_AUTOCORR
void _celt_autocorr(
const opus_val16 *x, /* in: [0...n-1] samples x */
opus_val32 *ac, /* out...
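The diff above drops the compile-time OVERRIDE_CELT_FIR guard and renames the generic function to celt_fir_c, which fits the runtime-detection approach named in the subject line. A minimal sketch of that kind of dispatch, assuming a table of function pointers indexed by a detected architecture: every name here (fir_fn, ARCH_C, ARCH_NEON, fir_c, fir_neon_stub, select_fir) is hypothetical, and the "optimized" entry is only a placeholder that forwards to the C version.

```c
#include <assert.h>

typedef void (*fir_fn)(const float *x, const float *num, float *y,
                       int n, int ord);

/* Hypothetical architecture ids; a real build would detect these. */
enum { ARCH_C = 0, ARCH_NEON = 1, ARCH_COUNT = 2 };

/* Generic C version: y[i] = x[i] + sum over j of num[j] * x[i - 1 - j].
   x must point just past `ord` samples of valid history. */
void fir_c(const float *x, const float *num, float *y, int n, int ord)
{
    int i, j;
    for (i = 0; i < n; i++) {
        float sum = x[i];
        for (j = 0; j < ord; j++)
            sum += num[j] * x[i - 1 - j];
        y[i] = sum;
    }
}

/* Placeholder for an optimized version; it only forwards to fir_c. */
void fir_neon_stub(const float *x, const float *num, float *y,
                   int n, int ord)
{
    fir_c(x, num, y, n, ord);
}

/* One table entry per architecture id. */
static const fir_fn FIR_IMPL[ARCH_COUNT] = { fir_c, fir_neon_stub };

/* Picks the implementation for an architecture detected at run time. */
fir_fn select_fir(int arch)
{
    assert(arch >= 0 && arch < ARCH_COUNT);
    return FIR_IMPL[arch];
}
```

With this layout the generic function is always compiled, and the choice between it and an optimized variant is made once per call site through the table instead of through the preprocessor.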
2015 Nov 07
12
[Aarch64 00/11] Patches to enable Aarch64 (arm64) optimizations, rebased to current master.
...es rebased to the current tip of Opus master.
They're largely the same as my previous patch set, with the addition
of the final one (the Neon fixed-point implementation of
xcorr_kernel). This replaces Viswanath's Neon fixed-point
celt_pitch_xcorr, since xcorr_kernel is used in celt_fir and celt_iir
as well.
These have been tested for correctness under qemu (including running
the test vectors), but not yet performance tested on a live aarch64
CPU (which will probably be an iPhone). I should be able to do this
Monday or Tuesday.
Jonathan Lennox (11):
Move ARM-specific macro overrides to ar...
2015 Dec 23
6
[AArch64 neon intrinsics v4 0/5] Rework Neon intrinsic code for Aarch64 patchset
Following Tim's comments, here are my reworked patches for the Neon intrinsic function patches
of my Aarch64 patchset, i.e. replacing patches 5-8 of the v2 series. Patches 1-4 and 9-18 of the
old series still apply unmodified.
The one new (as opposed to changed) patch is the first one in this series, to add named constants
for the ARM architecture variants.
There are also some minor code
2015 Nov 21
12
[Aarch64 v2 00/18] Patches to enable Aarch64 (version 2)
As promised, here's a re-send of all my Aarch64 patches, following
comments by John Ridges.
Note that they actually affect more than just Aarch64 -- other than
the ones specifically guarded by AARCH64_NEON defines, the Neon
intrinsics all also apply on armv7; and the OPUS_FAST_INT64 patches
apply on any 64-bit machine.
The patches should largely be independent and independently useful,
other
2016 Jul 14
6
Several patches of ARM NEON optimization
I rebased my previous 3 patches onto the current master with minor changes.
Patches 1 to 3 replace all my previously submitted patches.
Patches 4 and 5 are new.
Thanks,
Linfeng Zhang