Radford Neal
2015-Aug-21 19:35 UTC
[Rd] Getting SSE2 instructions to work in 32-bit builds on Windows
When getting pqR to work on Windows, I've wanted it to be able to use SSE2 instructions in 32-bit builds, for those 32-bit processors that have SSE2 instructions (all of them from the Pentium 4 onwards). It seems that the R Core 32-bit versions do not attempt this, instead using the 387 FPU for all floating-point arithmetic. This is sometimes slower than using SSE2 instructions, and it also produces results that are not compliant with the IEEE floating-point standard and that are not reproducible, possibly changing after trivial, unrelated changes to R or to the C compiler used.

One can get the gcc used in Rtools to use SSE2 instructions by including the following compiler options:

    -m32 -msse2 -mfpmath=sse

Unfortunately, the result is that some things then crash. The problem is that by default gcc assumes that the stack is aligned to a 16-byte boundary on entry to a procedure, which allows it to easily ensure the 16-byte alignment needed for SSE2 instructions. Windows, however, does not ensure that a 32-bit application's stack is 16-byte aligned, so this assumption fails. (There's no problem for 64-bit builds.)

A solution is to add one more option:

    -m32 -msse2 -mfpmath=sse -mstackrealign

The -mstackrealign option forces gcc to generate code to align the stack on procedure entry, rather than assuming it is already aligned. It would probably be enough to compile only a few modules with this option (the ones that are directly called from outside R), and hence avoid most of the extra procedure call overhead, but I haven't attempted this.

Radford Neal
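As an illustration of the "realign only at the entry points" idea mentioned above, here is a minimal sketch in C. It is not taken from pqR or R; it assumes gcc's per-function force_align_arg_pointer attribute (an x86 function attribute that is not mentioned in the original post) as a finer-grained alternative to compiling whole modules with -mstackrealign. The names REALIGN_ON_ENTRY, sum_of_squares, and aligned_local_ok are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: gcc (or a compatible compiler) targeting x86.
   The attribute asks the compiler to realign the stack in this
   function's prologue instead of trusting the caller. Elsewhere
   the macro expands to nothing. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define REALIGN_ON_ENTRY __attribute__((force_align_arg_pointer))
#else
#define REALIGN_ON_ENTRY
#endif

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical entry point, standing in for a routine that code
   outside our control (e.g. a Windows callback) may call with a
   stack that is only 4-byte aligned. */
REALIGN_ON_ENTRY
double sum_of_squares(const double *x, size_t n) {
    /* A 16-byte-aligned local, the kind of object SSE2 code needs. */
    double acc[2] __attribute__((aligned(16))) = {0.0, 0.0};
    size_t i;

    assert(x != NULL || n == 0);

    /* Accumulate in pairs, as a vectorizing compiler might. */
    for (i = 0; i + 1 < n; i += 2) {
        acc[0] += x[i] * x[i];
        acc[1] += x[i + 1] * x[i + 1];
    }
    if (i < n)              /* leftover element when n is odd */
        acc[0] += x[i] * x[i];
    return acc[0] + acc[1];
}

/* Reports whether an aligned local really landed on a 16-byte
   boundary inside a realigned function. */
REALIGN_ON_ENTRY
int aligned_local_ok(void) {
    double buf[2] __attribute__((aligned(16))) = {0.0, 0.0};
    return is_aligned16(buf);
}
```

Functions called only from inside such entry points could then be compiled without any realignment, since gcc keeps the stack 16-byte aligned across its own calls once an entry point has established it. Whether this would be sufficient for all of R's externally called routines is untested here.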