2015 Nov 23
1
[Aarch64 v2 05/18] Add Neon intrinsics for Silk noise shape quantization.
...n code is a performance boost for both platforms, and I'd rather not litter it with #ifdef's unless there's a large difference between the platforms.
It looks like Clang (the version in Xcode 7.1.1, at least) is smart enough to optimize the first two operations you mention, figuring out sshll2 and smlal2 properly, though the third causes a gratuitous extra "ext.16b" to be generated. I've filed a missed-optimization bug on Clang for the latter.
Here's the code it generates:
_silk_NSQ_noise_shape_feedback_loop_neon:
000000000000004c ldr w9, [x0]
0000000000000050 cmp w3, #8...
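For readers unfamiliar with the two instructions named above, here is a rough scalar sketch in portable C of what sshll2 and smlal2 compute on the 8H-to-4S forms. It is not code from the patch: the function names sshll2_model and smlal2_model are made up for illustration, and the correspondence to the vshll_high_n_s16 and vmlal_high_s16 intrinsics is my understanding of the ACLE mapping rather than something stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of SSHLL2 Vd.4S, Vn.8H, #shift:
   sign-extend the upper four 16-bit lanes to 32 bits, then shift left. */
static void sshll2_model(int32_t out[4], const int16_t in[8], unsigned shift)
{
    assert(out && in && shift < 16);
    for (int i = 0; i < 4; i++) {
        /* Shift as unsigned so negative inputs do not hit undefined behaviour. */
        out[i] = (int32_t)((uint32_t)(int32_t)in[4 + i] << shift);
    }
}

/* Hypothetical scalar model of SMLAL2 Vd.4S, Vn.8H, Vm.8H:
   multiply the upper four 16-bit lanes of a and b as signed values,
   widening to 32 bits, and accumulate into the four 32-bit lanes of acc. */
static void smlal2_model(int32_t acc[4], const int16_t a[8], const int16_t b[8])
{
    assert(acc && a && b);
    for (int i = 0; i < 4; i++) {
        /* The product of two int16_t values always fits in an int32_t. */
        int32_t prod = (int32_t)a[4 + i] * (int32_t)b[4 + i];
        /* Accumulate with wraparound, as the instruction does not saturate. */
        acc[i] = (int32_t)((uint32_t)acc[i] + (uint32_t)prod);
    }
}
```

The point of the "2" variants is that they read the upper half of the 128-bit source register directly, so a compiler that recognizes the pattern does not need a separate instruction to extract the high lanes first.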
2015 Nov 20
2
[Aarch64 00/11] Patches to enable Aarch64
> On Nov 19, 2015, at 5:47 PM, John Ridges <jridges at masque.com> wrote:
>
> Any speedup from the intrinsics may just be swamped by the rest of the encode/decode process. But I think you really want SIG2WORD16 to be (vqmovns_s32(PSHR32((x), SIG_SHIFT)))
Yes, you're right. I forgot to run the vectors under qemu with my previous version (oh, the embarrassment!) Fixed forthcoming
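To illustrate why the saturating narrow matters in the suggested SIG2WORD16, here is a small scalar sketch in portable C. It is only a model under stated assumptions: SIG_SHIFT is taken to be 12, PSHR32 is modeled as a round-to-nearest right shift, and sat16_model stands in for vqmovns_s32. The names pshr32_model, sat16_model, and sig2word16_model are hypothetical, and the rounding add is done in 64 bits to keep the sketch free of overflow, which may differ from the real macro at the extremes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value; check the actual definition in the codec headers. */
#define SIG_SHIFT_ASSUMED 12

/* Model of a rounding right shift: add half of the divisor, then shift.
   Relies on arithmetic right shift of negative values, which mainstream
   compilers provide but the C standard leaves implementation-defined. */
static int32_t pshr32_model(int32_t x, int shift)
{
    assert(shift > 0 && shift < 31);
    int64_t rounded = (int64_t)x + ((int64_t)1 << (shift - 1));
    return (int32_t)(rounded >> shift);
}

/* Model of a saturating narrow from 32 to 16 bits: clamp instead of wrap. */
static int16_t sat16_model(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Model of the suggested SIG2WORD16: round, shift, then saturate. */
static int16_t sig2word16_model(int32_t x)
{
    return sat16_model(pshr32_model(x, SIG_SHIFT_ASSUMED));
}
```

Without the saturating step, a large signal value would wrap when truncated to 16 bits and come out with the wrong sign, which is the kind of error that running the test vectors would expose.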