similar to: Fix ARM cpu selection if Intrinsics are enabled but not asm

Displaying 20 results from an estimated 3000 matches similar to: "Fix ARM cpu selection if Intrinsics are enabled but not asm"

2015 Jan 20
0
[RFC PATCH v1 1/2] Optimize repeated calls to opus_select_arch
Currently, opus_select_arch() is being called during initial setup of encoder/decoder structures and then stored. However, this "arch" variable does not always get passed to every function that may need it for architecture specific optimization. As a result, when a certain function is to be optimized for a particular architecture, we are having to change many function signatures in the
2013 May 23
2
ASM runtime detection and optimizations
I wrote a proof of concept for runtime detection of CPU capabilities and selection of optimized functions. I followed the design that was discussed on IRC. I also noticed a small drawback: we must propagate the arch index through functions that don't take the codec state as an argument. However, if it looks good, I will continue implementing it. Best regards, -- Aurélien Zanelli
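The runtime-detection design described in this message can be sketched roughly as follows. This is a minimal illustration, not the Opus implementation: the arch indices, the `select_arch` helper, the `inner_prod*` functions, and the table name are all hypothetical, and the "optimized" variant is plain C standing in for real SIMD code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical arch indices; these are not the values Opus uses. */
enum { ARCH_C = 0, ARCH_SIMD = 1, ARCH_COUNT = 2 };

typedef long (*inner_prod_fn)(const short *x, const short *y, size_t n);

/* Portable reference implementation. */
static long inner_prod_c(const short *x, const short *y, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += (long)x[i] * y[i];
    }
    return sum;
}

/* Stand-in for an optimized variant: unrolled by two, same result. */
static long inner_prod_unrolled(const short *x, const short *y, size_t n) {
    long s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += (long)x[i] * y[i];
        s1 += (long)x[i + 1] * y[i + 1];
    }
    if (i < n) {
        s0 += (long)x[i] * y[i];
    }
    return s0 + s1;
}

/* Jump table indexed by the arch value chosen at runtime. */
static const inner_prod_fn INNER_PROD_IMPL[ARCH_COUNT] = {
    inner_prod_c,
    inner_prod_unrolled,
};

/* Hypothetical detection: real code would query the CPU; here the
 * capability is passed in as a flag so the sketch stays portable. */
static int select_arch(int cpu_has_simd) {
    return cpu_has_simd ? ARCH_SIMD : ARCH_C;
}

/* The arch index is an explicit parameter because this function has
 * no codec state from which it could read a stored value. */
static long inner_prod(const short *x, const short *y, size_t n, int arch) {
    assert(arch >= 0 && arch < ARCH_COUNT);
    return INNER_PROD_IMPL[arch](x, y, n);
}
```

A caller that owns codec state would call `select_arch` once at setup and store the result; every helper without access to that state needs the extra `arch` argument, which is the drawback the message points out.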
2017 Jun 02
2
Opus floating-point NEON jump table question
Thank Jonathan! I'll fix the MAY_HAVE_NEON() in silk/arm/arm_silk_map.c Linfeng On Thu, Jun 1, 2017 at 3:34 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: > Semantically, OPUS_ARM_MAY_HAVE_NEON is supposed to mean the compiler > supports, and the CPU may support, Neon assembly code, which isn’t > necessarily the same thing as the compiler supporting Neon intrinsics. >
2015 Mar 04
2
Patch cleaning up Opus x86 intrinsics configury
On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: Viswenath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I
2015 Mar 04
2
Patch cleaning up Opus x86 intrinsics configury
Viswenath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I mentioned in the IRC room a problem I discovered in creating my patch, and then later improved the fix Tim had made for the problem. Where do you get conflicts merging it to tip? In terms of merging, you posted your patch before I posted mine, so probably I should be
2015 Mar 07
1
Patch cleaning up Opus x86 intrinsics configury
Hello Jonathan, Just FYI, I started doing review of your patch and will get back to you in few days. After review, I would like to rebase your patch (as necessary) myself and do some testing.. and re-submit. Regards, Vish On 4 March 2015 at 09:00, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at
2017 Jun 06
2
celt_inner_prod() and dual_inner_prod() NEON intrinsics
Hi Linfeng, On 06/06/17 04:09 PM, Jonathan Lennox wrote: > Two comments on the various infrastructure for RTCD etc. > > 1. The 0002- patch changes the ABI of the celt_pitch_xcorr functions, > but doesn’t change the assembly in celt/arm/celt_pitch_xcorr_arm.s > correspondingly. I suspect the ‘arch’ parameter can just be ignored > by the assembly functions, but at least the
2017 Jun 02
0
[PATCH] Don't use MAY_HAVE_NEON in arm_silk_map.c.
It's unnecessary, and isn't defined correctly on floating-point. This makes us correctly use Neon functions (in floating-point mode) on platforms where Neon is detected by RTCD. --- silk/arm/arm_silk_map.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/silk/arm/arm_silk_map.c b/silk/arm/arm_silk_map.c index 53a60a0..04767b5 100644 ---
2014 Nov 25
2
[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics
On 25 November 2014 at 10:11, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at vidyo.com> wrote: > > > > On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > >> > >> > Also is there plans to make the NEON optimisations
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Nov 16
2
configure options for x86
Thanks for the prompt and helpful replies. Built with --enable-custom-modes --disable-static --enable-intrinsics --enable-rtcd --enable-float-approx All worked. Only thing odd, rtcd was not enabled: Floating point support: ........ yes Fast float approximations: ..... yes Fixed point debugging: ......... no Inline Assembly Optimizations: . No inline ASM for your
2014 Nov 25
1
[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics
On 25 November 2014 at 10:18, Jonathan Lennox <jonathan at vidyo.com> wrote: > > On Nov 25, 2014, at 11:13 AM, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > On 25 November 2014 at 10:11, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > > On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at
2015 Mar 02
13
Patch cleaning up Opus x86 intrinsics configury
The attached patch cleans up Opus's x86 intrinsics configury. It: * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in
2017 Jun 01
2
Opus floating-point NEON jump table question
Thank Jean-Mark and Jonathan! I tested current OPUS encoder in floating-point with Complexity 8. Hacking using the attached patch (which will generate "#define OPUS_ARM_MAY_HAVE_NEON 1" in config.h) will speed up about 14.7% on my Chromebook. Probably it's because many NEON intrinsics optimizations can benefit both fixed-point and floating-point encoder. So if it's safe enough
2015 Dec 23
6
[AArch64 neon intrinsics v4 0/5] Rework Neon intrinsic code for Aarch64 patchset
Following Tim's comments, here are my reworked patches for the Neon intrinsic function patches of my Aarch64 patchset, i.e. replacing patches 5-8 of the v2 series. Patches 1-4 and 9-18 of the old series still apply unmodified. The one new (as opposed to changed) patch is the first one in this series, to add named constants for the ARM architecture variants. There are also some minor code
2015 Mar 04
0
Patch cleaning up Opus x86 intrinsics configury
On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at vidyo.com> wrote: > > On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > > > On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: >> >> Viswenath, >> >> My patch should be against the tip, but it's the very recent
2015 Mar 04
0
Patch cleaning up Opus x86 intrinsics configury
On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: > Viswenath, > > My patch should be against the tip, but it's the very recent tip, > including some changes this past Friday (27 Feb). I mentioned in the IRC > room a problem I discovered in creating my patch, and then later improved > the fix Tim had made for the problem. Where do you get conflicts
2015 Jan 20
0
[RFC PATCH v1 2/2] armv7(float): Optimize encode usecase using NE10 library
Optimize the opus encode (float only) use case using the ARM NE10 library. Mainly affects opus_fft and clt_mdct_forward and related functions. This optimization can be used for ARM CPUs that have a NEON VFP unit. This patch only enables optimizations for ARMv7. Official ARM NE10 library page available at http://projectne10.github.io/Ne10/ To enable this optimization, use --enable-intrinsics
2014 Nov 25
4
[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics
On Nov 25, 2014, at 10:07 AM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > > Also are there plans to make the NEON optimisations on ARMv7 run time > > detectable like they have in cairo/pixman? For generic distributions > > it would be nice to be able to enable them as they offer > > decent performance improvements but have the code