thr3ads.net - similar to: "[Aarch64 00/11] Patches to enable Aarch64"

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 10

Good to know. Thank you for the test. On 11/10/2015 2:37 PM, Jonathan Lennox wrote: >> On Nov 10, 2015, at 3:45 PM, John Ridges <jridges at masque.com> wrote: >> >> Since you're already set up for benchmarks, I would ask if you could >> benchmark the difference between using and not using the ARM64 inline >> assembly. I believe the original justification

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 10

> On Nov 10, 2015, at 3:45 PM, John Ridges <jridges at masque.com> wrote: > > Since you're already set up for benchmarks, I would ask if you could > benchmark the difference between using and not using the ARM64 inline > assembly. I believe the original justification on ARMv7 for the assembly > was the processor's panoply of multiply instructions and their long

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 13

Thanks, I look forward to seeing what you find out. BTW, I was wondering if you tried replacing the SIG2WORD16 macro using the vqmovns_s32 intrinsic? I'm sure it would be faster than the C code, but in the grand scheme of things it might not make much difference. On 11/13/2015 12:15 PM, Jonathan Lennox wrote: >> On Nov 13, 2015, at 1:51 PM, John Ridges <jridges at masque.com>

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 04

On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at vidyo.com> wrote: > > On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > > > On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: >> >> Viswenath, >> >> My patch should be against the tip, but it's the very recent

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 13

Hi Jonathan, I'm sorry to bring this up again, and I don't want to beat a dead horse, but I was very surprised by your benchmarks so I took a little closer look. I think what's happening is that it's a little unfair to compare the ARM64 inline assembly to the C code, because looking at the C macros in "fixed_generic.h" for MULT16_32_Q16 and MULT16_32_Q15 you find

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 07

Hello Jonathan, Just FYI, I have started reviewing your patch and will get back to you in a few days. After the review, I would like to rebase your patch myself (as necessary), do some testing, and resubmit. Regards, Vish On 4 March 2015 at 09:00, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: > > On 3 March 2015 at 22:17, Jonathan Lennox <jonathan at

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 04

On Mar 3, 2015, at 11:08 PM, Viswanath Puttagunta <viswanath.puttagunta at linaro.org> wrote: On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: Viswenath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 04

On 3 March 2015 at 21:59, Jonathan Lennox <jonathan at vidyo.com> wrote: > Viswenath, > > My patch should be against the tip, but it's the very recent tip, > including some changes this past Friday (27 Feb). I mentioned in the IRC > room a problem I discovered in creating my patch, and then later improved > the fix Tim had made for the problem. Where do you get conflicts

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 19

> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: > > I haven't yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That's an obvious next step. This doesn't show any appreciable speed difference in my tests, but the code is obviously better by inspection (all three of these map directly to a single Aarch64

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 04

Viswenath, My patch should be against the tip, but it's the very recent tip, including some changes this past Friday (27 Feb). I mentioned in the IRC room a problem I discovered in creating my patch, and then later improved the fix Tim had made for the problem. Where do you get conflicts merging it to tip? In terms of merging, you posted your patch before I posted mine, so probably I should be

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 20

> On Nov 19, 2015, at 5:47 PM, John Ridges <jridges at masque.com> wrote: > > Any speedup from the intrinsics may just be swamped by the rest of the encode/decode process. But I think you really want SIG2WORD16 to be (vqmovns_s32(PSHR32((x), SIG_SHIFT))) Yes, you're right. I forgot to run the vectors under qemu with my previous version (oh, the embarrassment!) Fixed forthcoming

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 12

One other minor thing: I notice that in the inline assembly the result (rd) is constrained as an earlyclobber operand. What was the reason for that?

[Aarch64 v2 10/18] Clean up some intrinsics-related wording in configure.

2015 Nov 21

---
 configure.ac | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index f52d2c2..e1a6e9b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -190,7 +190,7 @@ AC_ARG_ENABLE([rtcd],
 [enable_rtcd=yes])
 AC_ARG_ENABLE([intrinsics],
- [AS_HELP_STRING([--disable-intrinsics], [Disable intrinsics optimizations for ARM(float) X86(fixed)])],,
+

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 19

Any speedup from the intrinsics may just be swamped by the rest of the encode/decode process. But I think you really want SIG2WORD16 to be (vqmovns_s32(PSHR32((x), SIG_SHIFT))) On 11/19/2015 2:52 PM, Jonathan Lennox wrote: >> On Nov 16, 2015, at 4:42 PM, Jonathan Lennox <jonathan at vidyo.com> wrote: >> >> I haven't yet tried replacing SIG2WORD16 (or
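For readers unfamiliar with the macro under discussion, here is a rough portable sketch of the operation the excerpt describes: a rounding right shift followed by a saturating narrow to 16 bits. The lower-case function names, the SIG_SHIFT value of 12, and the 64-bit widening inside the shift are assumptions made for illustration; they are not copied from the Opus source or from the patches in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SIG_SHIFT 12 /* assumed value, chosen for illustration */

/* Rounding right shift, in the spirit of PSHR32: add half of the
 * divisor, then shift. The sum is widened to 64 bits here so the
 * bias cannot overflow; that is a choice made for this sketch. */
static int32_t pshr32(int32_t a, int shift)
{
    assert(shift > 0 && shift < 31);
    int64_t biased = (int64_t)a + ((int64_t)1 << (shift - 1));
    /* Assumes >> on a negative value is an arithmetic shift, which
     * holds for mainstream compilers but is implementation-defined. */
    return (int32_t)(biased >> shift);
}

/* Scalar saturating narrow from 32 to 16 bits. On AArch64 the
 * vqmovns_s32 intrinsic performs this clamp in one instruction. */
static int16_t sat16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Hypothetical stand-in for SIG2WORD16: shift first, then saturate. */
static int16_t sig2word16(int32_t x)
{
    return sat16(pshr32(x, SIG_SHIFT));
}
```

The order matters: the shift has to be applied before the saturating narrow, which is the point of the suggested form vqmovns_s32(PSHR32((x), SIG_SHIFT)). On an AArch64 build, sat16 could plausibly be replaced by vqmovns_s32 from arm_neon.h.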

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 16

I've tried adding support for OPUS_FAST_INT64 to celt/arch.h, and I've found that this is indeed comparable in speed, if not a touch faster, than my inline assembly. I'll submit patches for this. The inline assembly parts of my aarch64 patch set can thus be considered withdrawn. I haven't yet tried replacing SIG2WORD16 (or silk_ADD_SAT32/silk_SUB_SAT32) with Neon intrinsics. That's an obvious
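As background for the comparison above, the sketch below contrasts two ways of computing a 16x32-bit multiply with a Q16 result: one built from 16-bit halves using only 32-bit arithmetic, and one using a single 64-bit product, which is the kind of path a flag such as OPUS_FAST_INT64 would plausibly select on a 64-bit target. The function names are hypothetical and the bodies are written from the arithmetic alone, not copied from fixed_generic.h or celt/arch.h.

```c
#include <assert.h>
#include <stdint.h>

/* Split form: handle the 32-bit operand as a signed high half and an
 * unsigned low half, so every intermediate fits in 32 bits.
 * Assumes >> on a negative value is an arithmetic shift. */
static int32_t mult16_32_q16_split(int16_t a, int32_t b)
{
    int32_t hi = b >> 16;    /* signed top 16 bits */
    int32_t lo = b & 0xffff; /* bottom 16 bits, 0..65535 */
    return a * hi + ((a * lo) >> 16);
}

/* Wide form: one 64-bit multiply and one shift. */
static int32_t mult16_32_q16_wide(int16_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 16);
}

/* Both forms compute floor(a * b / 65536), so they should agree. */
static void check_same(int16_t a, int32_t b)
{
    assert(mult16_32_q16_split(a, b) == mult16_32_q16_wide(a, b));
}
```

Because the two forms return identical results under the arithmetic-shift assumption, the choice between them is purely about speed on a given target, which is what the benchmarks in this thread were measuring.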

[PATCH 1/3] Add configure check for Aarch64-specific Neon intrinsics.

2015 Nov 19

---
 configure.ac | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/configure.ac b/configure.ac
index 90a06c8..adcb969 100644
--- a/configure.ac
+++ b/configure.ac
@@ -503,6 +503,26 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[
 [rtcd_support="$rtcd_support (NE10)"])
 ])
+ OPUS_CHECK_INTRINSICS(
+

[RFC PATCHv1] cover: celt_pitch_xcorr: Introduce ARM neon intrinsics

2014 Nov 25

On 25 November 2014 at 10:18, Jonathan Lennox <jonathan at vidyo.com> wrote: > > On Nov 25, 2014, at 11:13 AM, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > On 25 November 2014 at 10:11, Viswanath Puttagunta > <viswanath.puttagunta at linaro.org> wrote: > > > On 25 November 2014 at 09:39, Jonathan Lennox <jonathan at

Patch cleaning up Opus x86 intrinsics configury

2015 Mar 03

Hello Jonathan, I am unable to apply your patch cleanly on tip. Timothy/opus-dev, This patch has some conflicts with my ARM patch that does fft optimizations http://lists.xiph.org/pipermail/opus/2015-March/002904.html http://lists.xiph.org/pipermail/opus/2015-March/002905.html One of us probably has to rebase depending on which patch goes into opus first. Regards, Vish On 1 March 2015 at

[Aarch64 00/11] Patches to enable Aarch64

2015 Nov 13

> On Nov 13, 2015, at 1:51 PM, John Ridges <jridges at masque.com> wrote: > > Hi Jonathan, > > I'm sorry to bring this up again, and I don't want to beat a dead horse, but I was very surprised by your benchmarks so I took a little closer look. > > I think what's happening is that it's a little unfair to compare the ARM64 inline assembly to the C code,

[RFC V3 7/8] armv7, armv8: Optimize fixed point fft using NE10 library

2015 May 15

Uses NEON optimized fixed point fft routines in NE10 library

Signed-off-by: Viswanath Puttagunta <viswanath.puttagunta at linaro.org>
Signed-off-by: Jonathan Lennox <jonathan at vidyo.com>
---
 Makefile.am | 12 +-
 celt/arm/arm_celt_map.c | 46 ++--
 celt/arm/celt_ne10_fft.c | 98 +++++----
 celt/arm/fft_arm.h |