Michael Spencer
2011-Apr-14 19:16 UTC
[LLVMdev] [x86 codegen] 3DNow! intrinsics not behaving as expected.
I finally got all of the 3DNow! instruction intrinsics and builtins
into LLVM and Clang. However, while testing them, I've noticed that
they produce incorrect results.
For example:
#include <stdio.h>

typedef float V2f __attribute__((vector_size(8)));

int main() {
  V2f dest, a = {1.0, 3.0}, b = {10.0, 3.5};
  dest = __builtin_ia32_pfadd(a, b);
  printf("(%f, %f)\n", dest[0], dest[1]);
}
This should output (11, 6.5). However, it outputs different values
depending on the optimization level. Generally one of the two values
is correct, and the other is -nan.
I looked at the program using a debugger, and the pfadd instruction is
executed correctly and the MMX register contains the correct values.
The code that prepares the stack for the printf call seems to be
messing it up.
Here's the assembly generated at O3 for the above:
.file "intrin.c"
.text
.globl main
.align 16, 0x90
.type main,@function
main: # @main
# BB#0: # %entry
pushl %ebp
movl %esp, %ebp
subl $56, %esp
movl $1077936128, -12(%ebp) # imm = 0x40400000
movl $1065353216, -16(%ebp) # imm = 0x3F800000
movl $1080033280, -4(%ebp) # imm = 0x40600000
movl $1092616192, -8(%ebp) # imm = 0x41200000
movq -16(%ebp), %mm0
pfadd -8(%ebp), %mm0
movq %mm0, -24(%ebp)
flds -20(%ebp)
fstpl 12(%esp)
flds -24(%ebp)
fstpl 4(%esp)
movl $.L.str, (%esp)
calll printf
xorl %eax, %eax
addl $56, %esp
popl %ebp
ret
.Ltmp0:
.size main, .Ltmp0-main
.type .L.str,@object # @.str
.section .rodata.str1.1,"aMS",@progbits,1
.L.str:
.asciz "%f, %f\n"
.size .L.str, 8
.section ".note.GNU-stack","",@progbits
Attached are my patches to enable support for this. I'd like to be
done with this, because 3DNow! isn't even supported anymore. I was
just adding these to learn tblgen and fill in some of the MSVC
intrinsic headers.
- Michael Spencer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 3dnow-builtins.patch
Type: application/octet-stream
Size: 23790 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110414/bdbdf8e4/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: clang-3dnow-builtins.patch
Type: application/octet-stream
Size: 8274 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110414/bdbdf8e4/attachment-0001.obj>
Eli Friedman
2011-Apr-14 19:47 UTC
[LLVMdev] [x86 codegen] 3DNow! intrinsics not behaving as expected.
On Thu, Apr 14, 2011 at 12:16 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:
> I finally got all of the 3DNow! instruction intrinsics and builtins
> into LLVM and Clang, however, while testing them, I've noticed that
> they produce incorrect results.
>
> For example:
>
> typedef float V2f __attribute__((vector_size(8)));
>
> int main() {
> V2f dest, a = {1.0, 3.0}, b = {10.0, 3.5};
> dest = __builtin_ia32_pfadd(a, b);
> printf("(%f, %f)\n", dest[0], dest[1]);
> }
>
> Should output (11, 6.5). However, it outputs different values
> depending on the optimization level. Generally one of them is correct,
> and the other is -nan.
>
> I looked at the program using a debugger, and the pfadd instruction is
> executed correctly and the MMX register contains the correct values.
> The code that prepares the stack for the printf call seems to be
> messing it up.

I would call that "user error"; basically, using MMX instructions
messes up the FP stack, and we assume the user is smart enough to make
sure the two don't mix.

-Eli
Chris Lattner
2011-Apr-14 21:37 UTC
[LLVMdev] [x86 codegen] 3DNow! intrinsics not behaving as expected.
On Apr 14, 2011, at 12:47 PM, Eli Friedman wrote:
>> I looked at the program using a debugger, and the pfadd instruction is
>> executed correctly and the MMX register contains the correct values.
>> The code that prepares the stack for the printf call seems to be
>> messing it up.
>
> I would call that "user error"; basically, using MMX instructions
> messes up the FP stack, and we assume the user is smart enough to make
> sure the two don't mix.

More specifically, if you use MMX/3dNow intrinsics, you have to call
"emms" at ABI boundaries.

-Chris
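The "emms at ABI boundaries" rule can be sketched roughly as follows. This is only an illustration, not the code from the patches: it uses the plain MMX integer intrinsics from <mmintrin.h> as a stand-in for the 3DNow! builtins (which need hardware and compiler support that may not be available), the helper name add2_i32 is hypothetical, and the non-MMX fallback branch is an assumption added so the sketch still builds on other targets. The point is the _mm_empty() call, which emits emms, placed after the last MMX operation and before control returns to code that may use the x87 FP stack (such as the argument setup for printf).

```c
#include <stdint.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical helper: lane-wise add of two pairs of 32-bit integers.
 * All MMX work stays inside this function, and the MMX state is
 * cleared before returning, so callers can use x87 floating point. */
static void add2_i32(const int32_t a[2], const int32_t b[2], int32_t out[2])
{
#if defined(__MMX__)
    __m64 va  = _mm_set_pi32(a[1], a[0]);   /* arguments: high lane, low lane */
    __m64 vb  = _mm_set_pi32(b[1], b[0]);
    __m64 sum = _mm_add_pi32(va, vb);       /* packed 32-bit add */

    /* Move both lanes out to ordinary integers first. */
    out[0] = _mm_cvtsi64_si32(sum);
    out[1] = _mm_cvtsi64_si32(_mm_srli_si64(sum, 32));

    /* emms: mark the x87 registers as empty again before leaving. */
    _mm_empty();
#else
    /* Assumed fallback for targets without MMX: plain scalar adds. */
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
#endif
}
```

A caller would invoke add2_i32(a, b, out) and only then convert or print the results with floating-point code. For the 3DNow! case in this thread, the same shape should apply: store the pfadd result to memory, issue emms (or femms), and read the floats afterwards.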