search for: linfeng

Displaying 20 results from an estimated 86 matches for "linfeng".

2017 Jun 06
3
celt_inner_prod() and dual_inner_prod() NEON intrinsics
Hi Linfeng, On 05/06/17 03:31 PM, Linfeng Zhang wrote: > Yes we'll have one more patch set related to xcorr in next week. Please > don't wait if it's too late for 1.2 release. Assuming there's no issue with the patches, next week isn't too late. Also, I've started looking at y...
2017 Jun 05
4
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...nner_.patch changes the performance. For floating-point ARM, only 0004-Optimize-floating-point-celt_inner_prod-and-dual_inn.patch changes the performance. Patch 1 and 2 are code clean-up and can only affect x86 performance. Patch 5 has neglectable effect on floating-point ARM performance. Thanks, Linfeng On Fri, Jun 2, 2017 at 11:26 AM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > I'll look into your patches. Can you let me know what's the expected > effect on performance (if any) for each of your patches? Also, are these > all the patches you inte...
2017 Feb 15
2
[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON
Hi Jean-Marc, (forgot cc opus@) Thanks for creating the unit test code. Attached is the updated optimization patch. On Mon, Feb 13, 2017 at 10:17 AM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > On 13/02/17 01:09 PM, Linfeng Zhang wrote: > > For 1), I agree that an explicit unit test would be a good plus to cover > > the cases that "make check" cannot trigger. If you like, we may submit > > an unit test patch for code review. > > Yes, please include a unit test that triggers the overfl...
2017 Apr 19
4
2 patches related to silk_biquad_alt() optimization
Hi, Attached are 2 patches related to silk_biquad_alt() optimization. Please review. Thanks, Linfeng Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170419/f08f5030/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Optimize-silk_biquad_alt-for...
2017 Jun 06
4
Antw: Re: celt_inner_prod() and dual_inner_prod() NEON intrinsics
>>> Linfeng Zhang <linfengz at google.com> wrote on 06.06.2017 at 06:46 in message <CAKoqLCAfj+fDUMLfN4dLNSZ4NNAZpaSt_BWZRp+7XBqfhiSqiQ at mail.gmail.com>: > Hi Jean-Marc, > > I tried "==" before, and it failed when both results are 0.0. Maybe the > exponent or sign has d...
2017 Feb 13
2
[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON
Hi Jean-Marc, Yes I confirm that we have done the same internal review on this patch. For 1), I agree that an explicit unit test would be a good plus to cover the cases that "make check" cannot trigger. If you like, we may submit an unit test patch for code review. Thanks, Linfeng On Thu, Feb 9, 2017 at 4:48 PM Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > Can you confirm that you the patch went through the same internal review > (presumably from James) than the previous ones? > > I had a look and did some testing and it looked go...
2017 Feb 15
4
[PATCH] Refactor silk_LPC_analysis_filter() & Optimize celt_fir_permit_overflow() for ARM NEON
Hi, Attached are two patches. Patch 1 refactors silk_LPC_analysis_filter(). And Patch 2 optimizes the new function celt_fir_permit_overflow() for ARM NEON. Please recommend a better function name. We did the same internal code review and testing already. Thanks, Linfeng -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170215/c44c8029/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0002-Optimize-celt_fir_permit_overflow-...
2017 Jun 06
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...and it failed when both results are 0.0. Maybe the exponent or sign has difference because of the different 0.0 representation in NEON. If anybody know how to handle this 0.0 comparison, that would be great. Or just use if(a==b || (a==0.0 && b==0.0)) ... but I haven't try this. Thanks, Linfeng On Mon, Jun 5, 2017 at 8:43 PM Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > On 05/06/17 03:31 PM, Linfeng Zhang wrote: > > Yes we'll have one more patch set related to xcorr in next week. Please > > don't wait if it's too late for 1.2 rel...
2017 Apr 25
2
2 patches related to silk_biquad_alt() optimization
...and I can remove the optimization of stride 1 case. If it's allowed to skip the split of A_Q28 and replace by 32-bit multiplication (result is 64-bit), probably it could be faster on NEON. This may change the encoder results because of different order of adding, shifting and rounding. Thanks, Linfeng On Wed, Apr 19, 2017 at 10:23 PM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > Thanks for the patches. I'll have a look and get back to you. What kind > of speedup are you getting for these functions? On what command line? > > Cheers, > >...
2017 Apr 19
3
[PATCH] cosmetics,silk: correct input/output arg comments
Hi, Attached is a patch for cosmetics purpose. Please review. Thanks, Linfeng Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170419/34354707/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-cosmetics-silk-correct-input-outp...
2017 Apr 05
4
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...e to the entire encoder. Comparing to master, this optimization patch speeds up fixed-point SILK encoder on NEON as following: Complexity 5: 6.1% Complexity 6: 5.8% Complexity 8: 5.5% Complexity 10: 4.0% when testing on an Acer Chromebook, ARMv7 Processor rev 3 (v7l), CPU max MHz: 2116.5 Thanks, Linfeng On Wed, Apr 5, 2017 at 11:02 AM, Jean-Marc Valin <jmvalin at jmvalin.ca> wrote: > Hi Linfeng, > > Thanks for the updated patch. I'll have a look and get back to you. When > you report speedup percentages, is that relative to the entire encoder > or relative to just that f...
2017 Apr 05
2
[PATCH] Optimize silk_warped_autocorrelation_FIX() for ARM NEON
...The new patch of silk_warped_autocorrelation_FIX_neon() has a code size of 3,228 bytes (with gcc). smaller_slower.c has a code size of 2,304 bytes, but the encoder is about 1.8% - 2.7% slower. smallest_slowest.c has a code size of 1,656 bytes, but the encoder is about 2.3% - 3.6% slower. Thanks, Linfeng On Mon, Apr 3, 2017 at 3:01 PM, Linfeng Zhang <linfengz at google.com> wrote: > Hi Jean-Marc, > > Attached is the silk_warped_autocorrelation_FIX_neon() which implements > your idea. > > Speed improvement vs the previous optimization: > > Complexity 0-4: Doesn't...
2017 Jun 01
4
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...nable small number or ratio. It's easy to create an input which 0 and 1,000 are both correct results by just manipulating the inner product order. The total speed gain is about 1.0% for fixed-point encoder, and 1.8% for floating-point encoder, in Complexity 8, tested on my Chromebook. Thanks, Linfeng -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170601/92c39072/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0005-Clean-celt_pitch_xcorr_float_neon....
2017 Jun 01
2
Opus floating-point NEON jump table question
...about 14.7% on my Chromebook. Probably it's because many NEON intrinsics optimizations can benefit both fixed-point and floating-point encoder. So if it's safe enough to enable MAY_HAVE_NEON in floating-point by default, it could speed up floating-point NEON encoder a little bit. Thanks, Linfeng On Thu, Jun 1, 2017 at 2:22 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: > > On May 31, 2017, at 12:47 PM, Linfeng Zhang <linfengz at google.com> wrote: > > Hi, > > ./configure --build x86_64-unknown-linux-gnu --host arm-linux-gnueabihf > --disable-assertion...
2017 Jul 21
2
[PATCH] Fix celt_pitch_xcorr ARM jump table compiling error
Hi, Attached is a fix related to ARM optimization jump table compiling error. Thanks, Linfeng Zhang -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170720/661d96b5/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Fix-celt_pitch_xcorr-ARM-jump-tab...
2017 Jun 06
2
celt_inner_prod() and dual_inner_prod() NEON intrinsics
Hi Linfeng, On 06/06/17 04:09 PM, Jonathan Lennox wrote: > Two comments on the various infrastructure for RTCD etc. > > 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions, > but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.s > correspondingly. I suspect the...
2017 Feb 07
3
[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON
Hi, Attached is a patch with arm neon optimizations for silk_LPC_inverse_pred_gain(). Please review. Thanks, Linfeng -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.xiph.org/pipermail/opus/attachments/20170207/5c5ab508/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-Optimize-silk_LPC_inverse_pred_gai...
2017 Feb 15
0
[PATCH] Optimize silk_LPC_inverse_pred_gain() for ARM NEON
Hi Linfeng, Thanks for the updated patch. Just pushed it to master. One thing that still bothers me a bit is that the if( ( max > 0 ) || ( min < -1 ) ) line is still pretty much untested. By which I mean that if I remove the condition, then the tests (including the new unit tests) still pass. I wasn...
2017 Jun 06
0
celt_inner_prod() and dual_inner_prod() NEON intrinsics
...s not IEEE 754-compliant here. :) Since x[i] == y[i] in both cases, they are actually calculating the energy. (-1.000134e-22 * -1.000134e-22) is larger than the smallest single-precision number and should be represented as nonzero (such as 0x8). I don't know why NEON gives 0 result. Thanks, Linfeng On Tue, Jun 6, 2017 at 12:03 AM, Ulrich Windl <Ulrich.Windl at rz.uni-regensburg.de> wrote: > >>> Linfeng Zhang <linfengz at google.com> wrote on 06.06.2017 at 06:46 in > message > <CAKoqLCAfj+fDUMLfN4dLNSZ4NNAZpaSt_BWZRp+7XBqfhiSqiQ at mail.gmail.com>: ...
2017 Apr 24
2
2 patches related to silk_biquad_alt() optimization
Hi Ulrich, As Jean-Marc recommended, we created "--enable-check-asm" config option to activate OPUS_CHECK_ASM macros in the optimization, where the C function is called inside and the results of C and optimization functions are compared when encoding/decoding the real audio files. Thanks, Linfeng On Wed, Apr 19, 2017 at 11:46 PM, Ulrich Windl < Ulrich.Windl at rz.uni-regensburg.de> wrote: > >>> Linfeng Zhang <linfengz at google.com> wrote on 19.04.2017 at 18:29 in > message > <CAKoqLCDX3eCUGbnZFvRzhiCV1Mbo2ksbj8K+pcVu60Dvit7WCQ at mail.gmail.com>: ...
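
The last result describes a checking pattern: the optimized function also calls the C function and the two results are compared. A minimal sketch of that idea follows. It is not taken from the Opus source; the names inner_prod_c, inner_prod_opt, inner_prod_checked and the macro CHECK_AGAINST_C are hypothetical stand-ins (the real option mentioned in the thread is OPUS_CHECK_ASM, whose actual usage is not shown in the snippet). The "optimized" version here is plain unrolled C rather than NEON intrinsics, and the inputs are assumed small enough that the 32-bit sum does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical switch standing in for a build-time option; in a real
 * build it would come from the configure step, not from the source. */
#define CHECK_AGAINST_C 1

/* Reference implementation: straightforward fixed-point inner product. */
static int32_t inner_prod_c(const int16_t *x, const int16_t *y, int n)
{
    int32_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += (int32_t)x[i] * y[i];
    return sum;
}

/* Stand-in for an optimized version: four independent accumulators,
 * mimicking how a SIMD lane layout reorders the additions. */
static int32_t inner_prod_opt(const int16_t *x, const int16_t *y, int n)
{
    int32_t acc[4] = {0, 0, 0, 0};
    int i = 0;
    for (; i + 4 <= n; i += 4)
        for (int k = 0; k < 4; k++)
            acc[k] += (int32_t)x[i + k] * y[i + k];
    int32_t sum = acc[0] + acc[1] + acc[2] + acc[3];
    for (; i < n; i++)              /* leftover tail elements */
        sum += (int32_t)x[i] * y[i];
    return sum;
}

/* Wrapper: run the optimized code and, when checking is enabled,
 * compare its result against the reference C implementation. */
static int32_t inner_prod_checked(const int16_t *x, const int16_t *y, int n)
{
    int32_t out = inner_prod_opt(x, y, n);
#ifdef CHECK_AGAINST_C
    assert(out == inner_prod_c(x, y, n));
#endif
    return out;
}
```

An exact comparison is reasonable in this sketch because integer addition gives the same sum in any order as long as nothing overflows. For floating-point code the summation order can change the result, which is the difficulty raised in the celt_inner_prod() thread above.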