Hi,

We have implemented rsqrt instruction generation for the X86 target architecture. We have introduced a flag, -fp-rsqrt, which controls the generation of the X86 rsqrt instruction.

We have observed minor effects on precision due to rsqrt and hence have put these transformations under the mentioned flag.

Note that -fp-rsqrt is presently only enabled together with the -enable-unsafe-fp-math flag.

Moreover, we have achieved some derived optimizations along with rsqrt generation.

The following are the details of the -fp-rsqrt flag, its values, and the optimizations each value enables.

-fp-rsqrt
  =off      - No rsqrt
  =on       - y/sqrt(x) => y * rsqrt(x)   // Standard
  =advance  - Standard, plus sqrt(x) => x * rsqrt(x)   // Advance
  =fda      - Advance, plus derive FMA, i.e. y/sqrt(x) + z => y * rsqrt(x) + z => vfmaddss y rsqrt(x) z.
              This is termed FDA (Fused Division Accumulation).

We are sending the code patch (against svn revision 167927), a text description, and test cases attached to this mail. We would also like to commit these changes back to the LLVM codebase. Please review and suggest.

Future enhancement plans are as follows.

TODO:
1. Enable vector rsqrt generation.
2. Generate different variations of FDA, i.e. FMSUB, FNMSUB, and FNMADD instructions, as required.

Best Regards,
soham

"The search for truth is more precious than its possession."

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rsqrt_167927.patch
Type: application/octet-stream
Size: 15452 bytes
Desc: rsqrt_167927.patch
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121115/3f05bfde/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rsqrt-advance.ll
Type: application/octet-stream
Size: 667 bytes
Desc: rsqrt-advance.ll
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121115/3f05bfde/attachment-0001.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rsqrt-description.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121115/3f05bfde/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rsqrt-fda.ll
Type: application/octet-stream
Size: 984 bytes
Desc: rsqrt-fda.ll
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121115/3f05bfde/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rsqrt-on.ll
Type: application/octet-stream
Size: 718 bytes
Desc: rsqrt-on.ll
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121115/3f05bfde/attachment-0003.obj>
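The three rewrites behind the -fp-rsqrt values can be illustrated with a small Python sketch. This is not the patch's code: `approx_rsqrt` is a hypothetical stand-in that models a reciprocal-square-root estimate by keeping about 12 mantissa bits of a single-precision result, and the other function names are likewise made up for illustration. It assumes x >= 0, and the last function only approximates a fused multiply-add by rounding once at the end.

```python
import math
import struct


def to_f32(value: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def approx_rsqrt(x: float, kept_bits: int = 12) -> float:
    """Hypothetical model of an rsqrt estimate; NOT the real instruction.

    Computes 1/sqrt(x) in single precision and then clears the low
    mantissa bits so that only `kept_bits` of the 23 remain.
    """
    if x == 0.0:
        return math.inf  # 1/sqrt(0) is +infinity
    exact = to_f32(1.0 / math.sqrt(x))
    bits = struct.unpack("<I", struct.pack("<f", exact))[0]
    mask = 0xFFFFFFFF ^ ((1 << (23 - kept_bits)) - 1)
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def div_sqrt_on(y: float, x: float) -> float:
    """=on: y/sqrt(x) rewritten as y * rsqrt(x)."""
    return to_f32(y * approx_rsqrt(x))


def sqrt_advance(x: float) -> float:
    """=advance: sqrt(x) rewritten as x * rsqrt(x)."""
    return to_f32(x * approx_rsqrt(x))


def div_sqrt_add_fda(y: float, x: float, z: float) -> float:
    """=fda: y/sqrt(x) + z rewritten as y * rsqrt(x) + z, rounded once."""
    return to_f32(y * approx_rsqrt(x) + z)


if __name__ == "__main__":
    print("y/sqrt(x) exact   :", 3.0 / math.sqrt(2.0))
    print("y * rsqrt(x)      :", div_sqrt_on(3.0, 2.0))
    print("x * rsqrt(x), x=0 :", sqrt_advance(0.0))
```

Under this model the rewritten forms agree with the exact results only to roughly three decimal digits, and `sqrt_advance(0.0)` evaluates 0 * infinity and so yields NaN where sqrt(0) would be 0. Both effects are consistent with keeping the transformations behind an unsafe-math option.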
On Wed, Nov 14, 2012 at 10:43 PM, Chakraborty, Soham <Soham.Chakraborty at amd.com> wrote:
> Hi,
>
> We have implemented rsqrt instruction generation for the X86 target
> architecture. We have introduced a flag, -fp-rsqrt, which controls the
> generation of the X86 rsqrt instruction.
>
> We have observed minor effects on precision due to rsqrt and hence have put
> these transformations under the mentioned flag.
>
> Note that -fp-rsqrt is presently only enabled together with the
> -enable-unsafe-fp-math flag.
>
> Moreover, we have achieved some derived optimizations along with rsqrt
> generation.
>
> The following are the details of the -fp-rsqrt flag, its values, and the
> optimizations each value enables.

We already have a way to indicate the expected accuracy of floating-point operations; see http://llvm.org/docs/LangRef.html#fpmath . Please use that rather than adding more options; floating-point math is complicated enough without adding unnecessary options.

We also already have ways to tell CodeGen to form FMA instructions; if those aren't working for you, the answer is to fix them, not to add a special case for when one of the operands happens to be an rsqrt.

The code you're adding isn't in the right place; it should probably be a target-specific DAGCombine, somewhere in X86ISelLowering.cpp.

Also, please take the time to read http://llvm.org/docs/CodingStandards.html .

-Eli
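The fpmath metadata referenced above expresses the maximum acceptable error of an instruction's result in ULPs. As a rough illustration of what such a budget means for an rsqrt-style estimate, the Python sketch below measures the ULP distance between a single-precision value and a degraded copy of it. The helper names and the 12-bit estimate model are assumptions made for this example; none of it is LLVM code, and comparing against a correctly rounded single-precision value only approximates a true ULP error.

```python
import struct


def f32_bits(value: float) -> int:
    """Bit pattern of `value` rounded to single precision."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def ulp_distance(a: float, b: float) -> int:
    """ULP distance between two positive, finite single-precision values.

    For such values the bit patterns are ordered like the values
    themselves, so the difference of the patterns counts the ULPs.
    """
    return abs(f32_bits(a) - f32_bits(b))


def within_fpmath_budget(approx: float, exact: float, max_ulps: float) -> bool:
    """Hypothetical check: does `approx` fit an fpmath-style ULP budget?"""
    return ulp_distance(approx, exact) <= max_ulps


# Single-precision 1/sqrt(2), and an estimate that keeps only the top
# 12 of its 23 mantissa bits (an assumed model of a coarse rsqrt estimate).
exact = struct.unpack("<f", struct.pack("<f", 0.5 ** 0.5))[0]
estimate = struct.unpack("<f", struct.pack("<I", f32_bits(exact) & 0xFFFFF800))[0]

if __name__ == "__main__":
    print("ULP distance:", ulp_distance(estimate, exact))
    print("fits a 2.5 ULP budget:", within_fpmath_budget(estimate, exact, 2.5))
```

With only 12 mantissa bits kept, the estimate can be off by up to 2047 ULPs in single precision, so a transformation like this would fit only a very loose accuracy budget.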
Hi,

Please find attached the modified patch and description. We have modified and retested the patch, taking into consideration the comments and inputs provided earlier.

Thanks & Regards,
soham

-----Original Message-----
From: Eli Friedman [mailto:eli.friedman at gmail.com]
Sent: Thursday, November 15, 2012 12:59 PM
To: Chakraborty, Soham
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] X86 rsqrt instruction generated

On Wed, Nov 14, 2012 at 10:43 PM, Chakraborty, Soham <Soham.Chakraborty at amd.com> wrote:
> Hi,
>
> We have implemented rsqrt instruction generation for the X86 target
> architecture. We have introduced a flag, -fp-rsqrt, which controls the
> generation of the X86 rsqrt instruction.
>
> We have observed minor effects on precision due to rsqrt and hence have
> put these transformations under the mentioned flag.
>
> Note that -fp-rsqrt is presently only enabled together with the
> -enable-unsafe-fp-math flag.
>
> Moreover, we have achieved some derived optimizations along with rsqrt
> generation.
>
> The following are the details of the -fp-rsqrt flag, its values, and the
> optimizations each value enables.

We already have a way to indicate the expected accuracy of floating-point operations; see http://llvm.org/docs/LangRef.html#fpmath . Please use that rather than adding more options; floating-point math is complicated enough without adding unnecessary options.

We also already have ways to tell CodeGen to form FMA instructions; if those aren't working for you, the answer is to fix them, not to add a special case for when one of the operands happens to be an rsqrt.

The code you're adding isn't in the right place; it should probably be a target-specific DAGCombine, somewhere in X86ISelLowering.cpp.

Also, please take the time to read http://llvm.org/docs/CodingStandards.html .

-Eli

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rsqrt_modified_167927.patch
Type: application/octet-stream
Size: 12306 bytes
Desc: rsqrt_modified_167927.patch
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121203/69610942/attachment.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: rsqrt-description.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121203/69610942/attachment.txt>