Displaying 4 results from an estimated 4 matches for "_mm_storel_epi64".
2018 May 24
0
X86 Intrinsics : _mm_storel_epi64/ _mm_loadl_epi64 with -m32
Hi,
I’m using _mm_storel_epi64/ _mm_loadl_epi64 in my test case as below
and generating 32-bit code (using -m32 and -msse4.2). The 64-bit load
and 64-bit store operations are replaced with two 32-bit mov
instructions, presumably due to the use of uint64_t type. If I use
__m128i instead of uint64_t everywhere, then the read and w...
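For reference, a minimal sketch of the kind of test case the post describes: a 64-bit copy through a `uint64_t` done with `_mm_loadl_epi64` and `_mm_storel_epi64`. The function name `copy64_via_sse` is hypothetical and not taken from the original message. Whether the compiler emits a single `movq` or splits the access into two 32-bit moves depends on the target and flags (the post reports the split with -m32 and -msse4.2).

```c
#include <emmintrin.h> /* SSE2: _mm_loadl_epi64, _mm_storel_epi64 */
#include <stdint.h>

/* Hypothetical helper: copy one 64-bit value using the SSE2 intrinsics.
 * _mm_loadl_epi64 loads the low 64 bits of an XMM register from memory and
 * zeroes the upper 64 bits; _mm_storel_epi64 writes the low 64 bits back. */
static void copy64_via_sse(uint64_t *dst, const uint64_t *src)
{
    __m128i v = _mm_loadl_epi64((const __m128i *)src);
    _mm_storel_epi64((__m128i *)dst, v);
}
```

Compiling this with and without -m32 and comparing the generated assembly is one way to reproduce the behaviour in question.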
2009 Jan 31
2
[LLVMdev] Optimized code analysis problems
...1, XMM2);
XMM1 = _mm_cmplt_epi32(XMM1, _mm_setzero_si128());
XMM1 = _mm_srli_epi32(XMM1, 31);
XMM4 = _mm_sub_epi32(XMM4, XMM1);
XMM4 = _mm_srl_epi32(XMM4, XMM5);
XMM3 = _mm_packs_epi32(XMM3, XMM4);
XMM3 = _mm_packus_epi16(XMM3, XMM3);
_mm_storel_epi64((__m128i*)(output+8*l), XMM3);
}
}
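The last three intrinsics in the excerpt above narrow 32-bit lanes down to bytes and write out 8 of them. A minimal, self-contained sketch of just that step, assuming eight `int32_t` inputs (the function name `pack8_i32_to_u8` and the array layout are illustrative, not from the original post):

```c
#include <emmintrin.h> /* SSE2 */
#include <stdint.h>

/* Hypothetical helper: narrow eight int32 values to uint8 with saturation
 * and store them as 8 contiguous bytes. */
static void pack8_i32_to_u8(const int32_t in[8], uint8_t out[8])
{
    __m128i lo = _mm_loadu_si128((const __m128i *)&in[0]); /* in[0..3] */
    __m128i hi = _mm_loadu_si128((const __m128i *)&in[4]); /* in[4..7] */

    /* int32 -> int16 with signed saturation: 8 lanes of int16. */
    __m128i w = _mm_packs_epi32(lo, hi);

    /* int16 -> uint8 with unsigned saturation; the low 8 bytes hold the
     * 8 results, the high 8 bytes are a duplicate. */
    __m128i b = _mm_packus_epi16(w, w);

    /* Store only the low 64 bits (8 bytes). */
    _mm_storel_epi64((__m128i *)out, b);
}
```

Values below 0 clamp to 0 and values above 255 clamp to 255, so no separate range check is needed before the store.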
2016 Jun 17
5
ARM NEON optimization -- celt_fir()
Hi all,
This is Linfeng Zhang from Google. I'll work on ARM NEON optimization in the
next few months.
I'm submitting 2 patches in the following couple of emails, which add the newly
created celt_fir_neon().
I revised celt_fir_c() to not pass in argument "mem" in Patch 1. If there are
concerns about this change, please let me know.
Many thanks for your comments.
Linfeng Zhang
2016 Jul 14
6
Several patches of ARM NEON optimization
I rebased my previous 3 patches to the current master with minor changes.
Patches 1 to 3 replace all my previously submitted patches.
Patches 4 and 5 are new.
Thanks,
Linfeng Zhang