thr3ads.net - similar to: "[PATCH] Optimize silk_LPC_inverse_pred

Displaying 20 results from an estimated 200 matches similar to: "[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON"

[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON

2017 Feb 13

Hi Jean-Marc, Yes, I confirm that we have done the same internal review on this patch. For 1), I agree that an explicit unit test would be a good plus to cover the cases that "make check" cannot trigger. If you like, we may submit a unit test patch for code review. Thanks, Linfeng On Thu, Feb 9, 2017 at 4:48 PM Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng,

[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON

2017 Feb 15

Hi Jean-Marc, (forgot cc opus@) Thanks for creating the unit test code. Attached is the updated optimization patch. On Mon, Feb 13, 2017 at 10:17 AM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > On 13/02/17 01:09 PM, Linfeng Zhang wrote: > > For 1), I agree that an explicit unit test would be a good plus to cover > > the cases that "make check" cannot

[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON

2017 Feb 10

Hi Linfeng, Can you confirm that the patch went through the same internal review (presumably from James) as the previous ones? I had a look and did some testing and it looked good to me. There are only two issues I'd like to resolve first -- neither of which is directly related to your code. 1) The overflow condition is essentially untested because none of the tests in "make

[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON

2017 Feb 13

On 13/02/17 01:09 PM, Linfeng Zhang wrote: > For 1), I agree that an explicit unit test would be a good plus to cover > the cases that "make check" cannot trigger. If you like, we may submit > a unit test patch for code review. Yes, please include a unit test that triggers the overflow detection. Once that works, I think we can merge this optimization. Cheers, Jean-Marc

[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON

2017 Feb 15

Hi Linfeng, Thanks for the updated patch. Just pushed it to master. One thing that still bothers me a bit is that the if( ( max > 0 ) || ( min < -1 ) ) line is still pretty much untested. By which I mean that if I remove the condition, then the tests (including the new unit tests) still pass. I wasn't able to figure out a case that triggers it -- and I'm not even 100% sure it's

Several patches of ARM NEON optimization

2016 Jul 14

I rebased my previous 3 patches onto the current master with minor changes. Patches 1 to 3 replace all my previously submitted patches. Patches 4 and 5 are new. Thanks, Linfeng Zhang

[PATCH] Add silk/tests/test_unit_optimization_LPC_inv_pred_gain

2017 Feb 14

Hi, Attached is a patch with silk/tests/test_unit_optimization_LPC_inv_pred_gain, which unit-tests the silk_LPC_inverse_pred_gain() optimizations. Please review. The number of test loops is set to 10,000, since all branches in this function get hit after 9,085 loops of random inputs. Thanks, Linfeng
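The testing approach described in this message, running a reference routine and an optimized routine on many random inputs and requiring identical output, can be sketched in C as follows. This is only an illustration under stated assumptions: `energy_ref()`, `energy_unrolled()` and `run_differential_test()` are hypothetical stand-ins and are not the code from the patch or silk_LPC_inverse_pred_gain() itself. Only the loop count of 10,000 comes from the message.

```c
#include <stdint.h>
#include <stdlib.h>

#define VEC_LEN 16     /* hypothetical input length */
#define LOOPS   10000  /* loop count quoted in the message */

/* Clamp a 64-bit value to the int32_t maximum (inputs here are non-negative). */
static int32_t saturate32(int64_t v)
{
    return v > INT32_MAX ? INT32_MAX : (int32_t)v;
}

/* Hypothetical reference routine: saturated sum of squares. */
int32_t energy_ref(const int16_t *x, int n)
{
    int64_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (int32_t)x[i] * x[i];
    return saturate32(acc);
}

/* Hypothetical "optimized" routine: same result, four partial accumulators. */
int32_t energy_unrolled(const int16_t *x, int n)
{
    int64_t acc[4] = { 0, 0, 0, 0 };
    int i;
    for (i = 0; i + 3 < n; i += 4)
        for (int k = 0; k < 4; k++)
            acc[k] += (int32_t)x[i + k] * x[i + k];
    for (; i < n; i++)                 /* leftover elements */
        acc[0] += (int32_t)x[i] * x[i];
    return saturate32(acc[0] + acc[1] + acc[2] + acc[3]);
}

/* Feed both routines the same random vectors; return the mismatch count. */
int run_differential_test(unsigned seed)
{
    int mismatches = 0;
    srand(seed);
    for (int loop = 0; loop < LOOPS; loop++) {
        int16_t x[VEC_LEN];
        for (int i = 0; i < VEC_LEN; i++)
            x[i] = (int16_t)((rand() % 65536) - 32768);
        if (energy_ref(x, VEC_LEN) != energy_unrolled(x, VEC_LEN))
            mismatches++;
    }
    return mismatches;
}
```

A fixed seed keeps a failing run reproducible. As the thread notes elsewhere, random inputs may still leave a rarely taken branch unexercised, so a check of this kind complements rather than replaces targeted test vectors.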

celt_inner_prod() and dual_inner_prod() NEON intrinsics

2017 Jun 05

Hi Jean-Marc, I attached the new version in inner_prod_5patches_v2.zip, which is synced to the current master. For fixed-point ARM, only 0003-Optimize-fixed-point-celt_inner_prod-and-dual_inner_.patch changes the performance. For floating-point ARM, only 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch changes the performance. Patches 1 and 2 are code clean-up and can only affect x86

celt_inner_prod() and dual_inner_prod() NEON intrinsics

2017 Jun 06

Hi Linfeng, On 05/06/17 03:31 PM, Linfeng Zhang wrote: > Yes we'll have one more patch set related to xcorr in next week. Please > don't wait if it's too late for 1.2 release. Assuming there's no issue with the patches, next week isn't too late. Also, I've started looking at your patches. So far there's one thing that puzzles me a bit. In the OPUS_CHECK_ASM

2 patches related to silk_biquad_alt() optimization

2017 Apr 19

Hi, Attached are 2 patches related to silk_biquad_alt() optimization. Please review. Thanks, Linfeng Zhang

[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON

2017 Feb 15

Hi, Attached are two patches. Patch 1 refactors silk_LPC_analysis_filter(), and Patch 2 optimizes the new function celt_fir_permit_overflow() for ARM NEON. Please recommend a better function name. We did the same internal code review and testing already. Thanks, Linfeng

Antw: Re: celt_inner_prod() and dual_inner_prod() NEON intrinsics

2017 Jun 06

>>> Linfeng Zhang <linfengz at google.com> wrote on 06.06.2017 at 06:46 in message <CAKoqLCAfj+fDUMLfN4dLNSZ4NNAZpaSt_BWZRp+7XBqfhiSqiQ at mail.gmail.com>: > Hi Jean-Marc, > > I tried "==" before, and it failed when both results are 0.0. Maybe the > exponent or sign differs because of the different 0.0 representation > in NEON. If anybody
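As background for the comparison question raised in this message, the following C sketch contrasts comparing two floats by value with comparing their bit patterns. It assumes IEEE 754 single precision, where +0.0 and -0.0 compare equal under `==` even though their sign bits differ. The helper names are hypothetical, and the sketch does not attempt to reproduce or diagnose the failure reported above.

```c
#include <string.h>

/* Value comparison: under IEEE 754, +0.0f == -0.0f evaluates to true. */
int floats_equal_by_value(float a, float b)
{
    return a == b;
}

/* Bit-exact comparison: +0.0f and -0.0f differ in the sign bit,
 * so this returns 0 for that pair. */
int floats_equal_by_bits(float a, float b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}
```

A test that must tolerate differently signed zeros can therefore use the value comparison, while a test that demands bit-exact output needs the bitwise form.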

[PATCH] cosmetics,silk: correct input/output arg comments

2017 Apr 19

Hi, Attached is a patch for cosmetic purposes (0001-cosmetics-silk-correct-input-output-arg-comments.patch). Please review. Thanks, Linfeng Zhang

celt_inner_prod() and dual_inner_prod() NEON intrinsics

2017 Jun 01

Hi, Attached are 5 patches related to celt_inner_prod() and dual_inner_prod() NEON intrinsics optimization. In 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch, the optimization changed the order of floating-point inner products, which will change the results. I created celt_inner_prod_neon_float_c_simulation() and dual_inner_prod_neon_float_c_simulation() to simulate the order
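The point made in this message, that changing the accumulation order of a floating-point inner product changes the result, can be illustrated with a short C sketch. The function names, the four-lane layout and the final reduction order are assumptions chosen for illustration. They are not taken from the patch, whose actual order may differ.

```c
#include <stddef.h>

/* Plain left-to-right accumulation, as a scalar C loop performs it. */
float inner_prod_sequential(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Hypothetical model of a 4-lane SIMD loop: element i is accumulated into
 * lane i % 4, and the four partial sums are combined at the end.
 * Assumes n is a multiple of 4. */
float inner_prod_four_lanes(const float *x, const float *y, size_t n)
{
    float lane[4] = { 0.0f, 0.0f, 0.0f, 0.0f };
    for (size_t i = 0; i < n; i += 4)
        for (size_t k = 0; k < 4; k++)
            lane[k] += x[i + k] * y[i + k];
    return (lane[0] + lane[1]) + (lane[2] + lane[3]);
}
```

For example, with x = {1e8f, 1, 1, 1, -1e8f, 1, 1, 1} and y set to all ones, the sequential loop loses the first three 1s when adding them to 1e8 and returns 3, while the lane model cancels 1e8 against -1e8 inside one lane and returns 6. This assumes IEEE 754 single-precision evaluation without fast-math reassociation. A C simulation that mirrors the SIMD order is what allows a bit-exact comparison against the optimized code.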

[PATCH] Fix celt_pitch_xcorr ARM jump table compiling error

2017 Jul 21

Hi, Attached is a fix for a compile error in the ARM optimization jump table. Thanks, Linfeng Zhang

Opus floating-point NEON jump table question

2017 May 31

Hi, ./configure --build x86_64-unknown-linux-gnu --host arm-linux-gnueabihf --disable-assertions --disable-check-asm --enable-intrinsics CFLAGS=-O3 --disable-shared When configuring with floating-point and intrinsics enabled as above, the generated config.h only has OPUS_ARM_MAY_HAVE_NEON_INTR defined (to 1), with /* #undef OPUS_ARM_ASM */ /* #undef OPUS_ARM_INLINE_ASM */ /* #undef

Opus floating-point NEON jump table question

2017 Jun 01

Thanks, Jean-Marc and Jonathan! I tested the current Opus encoder in floating-point mode at Complexity 8. Hacking it with the attached patch (which generates "#define OPUS_ARM_MAY_HAVE_NEON 1" in config.h) speeds it up by about 14.7% on my Chromebook. This is probably because many NEON intrinsics optimizations benefit both the fixed-point and the floating-point encoder. So if it's safe enough

celt_inner_prod() and dual_inner_prod() NEON intrinsics

2017 Jun 06

Hi Linfeng, On 06/06/17 04:09 PM, Jonathan Lennox wrote: > Two comments on the various infrastructure for RTCD etc. > > 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions, > but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.s > correspondingly. I suspect the ‘arch’ parameter can just be ignored > by the assembly functions, but at least the

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

Thanks, Jean-Marc! The speedup percentages are all relative to the entire encoder. Compared to master, this optimization patch speeds up the fixed-point SILK encoder on NEON as follows: Complexity 5: 6.1%, Complexity 6: 5.8%, Complexity 8: 5.5%, Complexity 10: 4.0%, when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max MHz: 2116.5. Thanks, Linfeng On Wed, Apr 5, 2017 at 11:02 AM,

[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON

2017 Apr 05

I attached a new patch with a small cleanup (the disassembly is identical to that of the last patch). We have done the same internal testing as usual. Also attached are 2 failed temporary versions that try to reduce code size (for code review reference only). The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of 3,228 bytes (with gcc). smaller_slower.c has a code size of 2,304
