Badouh, Asaf
2013-Feb-13 16:04 UTC
[LLVMdev] Using MSVC _ftol2 runtime function for fptoui on Win32
Hi Joe & Michael,

In rev. 151382 you changed the fptoui implementation of the x86 codegen for Win32.

Before the change, fptoui was lowered to

    flds 16(%esp)
    fisttpll 8(%esp)
    movl 8(%esp), %eax

After the change, fptoui is lowered to

    flds 40(%esp)
    calll _ftol2

Please note that the assumption that _ftol2 doesn't modify ECX does not hold on the Sandy Bridge platform.

Could you share the reasons behind this change? Did you get better performance after it?

Thanks,
Asaf
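For readers unfamiliar with the sequence quoted above, here is a minimal C sketch of what the pre-change lowering appears to compute: truncate the float to a signed 64-bit integer (the fisttpll store), then keep only the low 32 bits (the movl load). It assumes the result type is a 32-bit unsigned integer, which the message does not state explicitly, and the function name is hypothetical.

```c
#include <stdint.h>

/* Sketch only: assumes a 32-bit unsigned result, as the single movl suggests.
 * Inputs are expected to lie in [0, 2^32); other values are outside what
 * fptoui defines. */
static uint32_t fptoui_via_i64(float x)
{
    /* Corresponds to fisttpll: truncate toward zero into a signed 64-bit slot. */
    int64_t wide = (int64_t)x;

    /* Corresponds to movl 8(%esp), %eax: read back only the low 32 bits. */
    return (uint32_t)wide;
}
```

Going through a signed 64-bit intermediate is what lets values at or above 2^31 convert correctly, since they do not fit in a signed 32-bit integer.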
Joe Groff
2013-Feb-13 21:48 UTC
[LLVMdev] Using MSVC _ftol2 runtime function for fptoui on Win32
On Wed, Feb 13, 2013 at 8:04 AM, Badouh, Asaf <asaf.badouh at intel.com> wrote:

> Hi Joe & Michael,
>
> In rev. 151382 you have changed the fptoui implementation of the x86
> codegen for win32.
>
> Before the change fptoui was lowered to
>
> flds 16(%esp)
> fisttpll 8(%esp)
> movl 8(%esp), %eax
>
> After the change fptoui is lowered to
>
> flds 40(%esp)
> calll _ftol2
>
> Please note that the assumption that _ftol2 doesn’t modify ECX isn’t true
> on sandybridge platform.
>
> Could you share with me the reasons behind this change? Did you get better
> performance after this change?

"fisttp" is only available with SSE3 or later. Before that change, if SSE3 was unavailable, legalization would lower fptoui and fptosi to a libgcc/compiler-rt call that does not exist on Win32, so the change was necessary for compatibility with MSVC.

If SSE3 is enabled, it should use 'fisttp', and if _ftol2 clobbers ECX, the pseudo-instruction for it can be fixed to reflect that. Those would both be good things to fix.

-Joe
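The selection logic described in this reply can be summarized in a short C sketch. This is not LLVM code. The enum, the function, and both parameter names are hypothetical, and the sketch only restates the three cases mentioned above: fisttp when SSE3 is available, _ftol2 on Win32 without SSE3, and a libgcc/compiler-rt runtime call otherwise.

```c
#include <stdbool.h>

/* Hypothetical labels for the three lowerings mentioned in the thread. */
enum fp_to_int_lowering {
    LOWER_FISTTP,          /* inline fisttp, requires SSE3 */
    LOWER_FTOL2_CALL,      /* call into the MSVC runtime's _ftol2 */
    LOWER_RUNTIME_LIBCALL  /* libgcc/compiler-rt helper, absent on Win32 */
};

/* Sketch of the intended decision, under the assumptions stated above. */
static enum fp_to_int_lowering choose_lowering(bool has_sse3, bool is_win32)
{
    if (has_sse3)
        return LOWER_FISTTP;      /* preferred whenever the instruction exists */
    if (is_win32)
        return LOWER_FTOL2_CALL;  /* the libgcc/compiler-rt helper is unavailable */
    return LOWER_RUNTIME_LIBCALL;
}
```

Under this sketch, the behavior reported in the original message (a call to _ftol2 on a target with SSE3) corresponds to skipping the first check. The ECX issue is a separate fix: the call's clobber list has to include every register _ftol2 actually modifies.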