Jonathan Ragan-Kelley
2011-Dec-28 18:42 UTC
[LLVMdev] Codegen for vector float->double cast fails on x86 above SSE3
I've isolated a bug in SSE codegen to the attached example.

    define void @f(<2 x float>* %in, <2 x double>* %out) {
    entry:
      %0 = load <2 x float>* %in, align 8
      %1 = fpext <2 x float> %0 to <2 x double>
      store <2 x double> %1, <2 x double>* %out, align 1
      ret void
    }

The code should load a <2 x float> vector from %in, fpext cast it to a <2 x double>, and do an unaligned store (movupd) of the result to %out.

This works as expected on earlier SSE targets, generating this with llc -mcpu=core2:

    movss     (%rdi), %xmm1
    movss     4(%rdi), %xmm0
    cvtss2sd  %xmm0, %xmm0
    cvtss2sd  %xmm1, %xmm1
    unpcklpd  %xmm0, %xmm1    ## xmm1 = xmm1[0],xmm0[0]
    movupd    %xmm1, (%rsi)
    ret

Load both, cast float to double (cvtss2sd), pack vectors, and store.

But with llc -mcpu=penryn or greater, it yields nonsense:

    movq      (%rdi), %xmm0
    pshufd    $16, %xmm0, %xmm0    ## xmm0 = xmm0[0,0,1,0]
    movdqu    %xmm0, (%rsi)
    ret

Attachments:
- vec_cast.ll: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111228/3a33b948/attachment.obj>
- vec_cast.sse3.s: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111228/3a33b948/attachment-0001.obj>
- vec_cast.sse4.s: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20111228/3a33b948/attachment-0002.obj>
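For readers following along, here is a rough C model of the two behaviours described in the message above. It is not part of the original report: the function names f_expected and f_penryn_model are made up for illustration, and the second function is only a by-hand reading of the movq / pshufd $16 / movdqu sequence, assuming 32-bit float and 64-bit double. It suggests why the penryn output looks wrong: the float bit patterns are shuffled into place but never converted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Intended semantics of the IR: widen each float lane to double (fpext). */
void f_expected(const float in[2], double out[2]) {
    out[0] = (double)in[0];
    out[1] = (double)in[1];
}

/* Hypothetical model of the -mcpu=penryn sequence from the report.
   movq loads the two floats as raw 32-bit lanes, pshufd $16 picks
   source lanes [0,0,1,0], and movdqu stores the 16 bytes unchanged.
   No float-to-double conversion happens anywhere. */
void f_penryn_model(const float in[2], double out[2]) {
    uint32_t lanes[2];
    uint32_t shuffled[4];

    assert(sizeof(double) == 2 * sizeof(uint32_t));

    memcpy(lanes, in, sizeof lanes);      /* movq (%rdi), %xmm0 */
    shuffled[0] = lanes[0];               /* pshufd $16: lane 0 <- src 0 */
    shuffled[1] = lanes[0];               /*             lane 1 <- src 0 */
    shuffled[2] = lanes[1];               /*             lane 2 <- src 1 */
    shuffled[3] = lanes[0];               /*             lane 3 <- src 0 */
    memcpy(out, shuffled, sizeof shuffled); /* movdqu %xmm0, (%rsi) */
}
```

Calling both functions on the same input, say {1.0f, 2.0f}, gives {1.0, 2.0} from f_expected, while f_penryn_model reinterprets pairs of float bit patterns as doubles and produces unrelated values.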
Jakob Stoklund Olesen
2011-Dec-28 20:36 UTC
[LLVMdev] Codegen for vector float->double cast fails on x86 above SSE3
On Dec 28, 2011, at 10:42 AM, Jonathan Ragan-Kelley wrote:

> I've isolated a bug in SSE codegen to the attached example.

Hi Jonathan,

Great bug report! Please file it in Bugzilla: http://llvm.org/bugs/

/jakob