I have a test case (attached as fc_long.ll) that, when run through the
optimizer, produces 65-bit integer math (fc_long-opt.ll).

Now I understand that LLVM can have integers of any length, but I consider
turning a 64-bit mul into multiple 65-bit instructions to be a 'bad'
optimization. This eventually expands to a 128-bit multiply call (__multi3),
which I have absolutely no interest in supporting. So I'm wondering which
optimization might be the culprit here, so that I can disable it in this
situation.

Micah
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fc_long.ll
Type: application/octet-stream
Size: 3302 bytes
Desc: fc_long.ll
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20091012/e646bec7/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fc_long-opt.ll
Type: application/octet-stream
Size: 2274 bytes
Desc: fc_long-opt.ll
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20091012/e646bec7/attachment-0001.obj>

On Mon, Oct 12, 2009 at 6:15 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote:
> I have a test case(attached as fc_long.ll) that when run through the
> optimizer produces 65bit integer math(fc_long-opt.ll).
>
> Now I understand that llvm can have any length integer, but I consider
> turning a 64bit mul into multiple 65 bit instructions to be a ‘bad’
> optimization. This eventually expands to a 128bit multiply call(__multi3)
> which I have absolutely no interest in supporting. So I’m wondering what
> optimization might be the culprit here so I can disable it in this
> situation.

I'm pretty sure this is the fault of one of the loop passes (I forget
which one off the top of my head). It's worth noting that the
optimizer manages to eliminate the loop; computing the exit value of
one of the loop variables uses the wide multiply.

-Eli

On Mon, Oct 12, 2009 at 8:22 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Mon, Oct 12, 2009 at 6:15 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote:
>> I have a test case(attached as fc_long.ll) that when run through the
>> optimizer produces 65bit integer math(fc_long-opt.ll).
>>
>> Now I understand that llvm can have any length integer, but I consider
>> turning a 64bit mul into multiple 65 bit instructions to be a ‘bad’
>> optimization. This eventually expands to a 128bit multiply call(__multi3)
>> which I have absolutely no interest in supporting. So I’m wondering what
>> optimization might be the culprit here so I can disable it in this
>> situation.
>
> I'm pretty sure this is the fault of one of the loop passes (I forget
> which one off the top of my head). It's worth noting that the
> optimizer manages to eliminate the loop; computing the exit value of
> one of the loop variables uses the wide multiply.

It's -indvars. I ran mem2reg, instcombine, and simplifycfg on your
input and got the attached file.

I thought I could add "nuw nsw" to the arithmetic inside the loop to
say that i65 is unnecessary, but that doesn't cause indvars to leave
the arithmetic at i64. Is that a bug in indvars, or in SCEV, or is
this just a natural limitation of SCEV?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fc_long.ll
Type: application/octet-stream
Size: 1593 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20091012/932523ab/attachment.obj>

I haven't looked at your attached .ll, but I have added code to the loop
analyser to make it do this. Given the choice between not analysing loop
evolutions and growing arithmetic by an extra bit, it will expand the
arithmetic.

It's in SCEV, which is used by pretty much every loop optimizer.

Nick

Villmow, Micah wrote:
> I have a test case(attached as fc_long.ll) that when run through the
> optimizer produces 65bit integer math(fc_long-opt.ll).
>
> Now I understand that llvm can have any length integer, but I consider
> turning a 64bit mul into multiple 65 bit instructions to be a ‘bad’
> optimization. This eventually expands to a 128bit multiply
> call(__multi3) which I have absolutely no interest in supporting. So I’m
> wondering what optimization might be the culprit here so I can disable
> it in this situation.
>
> Micah
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
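
For illustration only, the trade-off Nick describes (one extra bit versus
not analysing the loop) can be sketched in a few lines of C++. This is not
code from LLVM. It is scaled down to a 32-bit accumulator so that uint64_t
can stand in for the wider type, and the function names are made up for
this example. The loop sums 0 + 1 + ... + (n-1), whose closed form is
n*(n-1)/2; the division by 2 is only exact if the product keeps one more
bit than the result.

```cpp
#include <cstdint>

// Reference: actually run the loop, accumulating in 32 bits
// (unsigned arithmetic wraps modulo 2^32).
uint32_t sum_by_loop(uint32_t n) {
    uint32_t acc = 0;
    for (uint32_t i = 0; i < n; ++i)
        acc += i;
    return acc;
}

// Closed form with the product held in a wider type before dividing.
// uint64_t is a stand-in for the "W + 1 bit" integer (i33 here, i65 in
// the thread); only one extra bit is strictly needed.
uint32_t sum_closed_wide(uint32_t n) {
    uint64_t prod = static_cast<uint64_t>(n) * (n - 1);
    return static_cast<uint32_t>(prod / 2);
}

// Closed form with the product truncated to 32 bits before dividing.
// The top bit of the result is lost whenever the product overflows.
uint32_t sum_closed_narrow(uint32_t n) {
    uint32_t prod = n * (n - 1);
    return prod / 2;
}
```

With n = 65537 the loop and the wide closed form both give 2147516416,
while the narrow closed form gives 32768. Small trip counts never expose
the difference, which fits with such truncation rarely showing up in tests.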

On Monday 12 October 2009 22:22, Eli Friedman wrote:
> > Now I understand that llvm can have any length integer, but I consider
> > turning a 64bit mul into multiple 65 bit instructions to be a ‘bad’
> > optimization. This eventually expands to a 128bit multiply call(__multi3)
> > which I have absolutely no interest in supporting. So I’m wondering what
> > optimization might be the culprit here so I can disable it in this
> > situation.
>
> I'm pretty sure this is the fault of one of the loop passes (I forget
> which one off the top of my head).

It's SCEV. I have this very bit of code disabled in our tree for exactly
the reasons Micah gives. Check out BinomialCoefficient in
ScalarEvolution.cpp.

Here's the big hack we put in, which is just the code that used to be
there in 2.3:

  // We need at least W + T bits for the multiplication step
  // FIXME: A temporary hack; we round up the bitwidths
  // to the nearest power of 2 to be nice to the code generator.
  unsigned CalculationBits = 1U << Log2_32_Ceil(W + T);
  // Cray: Truncate to 64 bits.  It's what we did with LLVM 2.3 and
  // while it can create miscompilations in some cases if the
  // computation happens to overflow, it's no worse than what we did
  // before.
  if (CalculationBits > 64) {
    //return new SCEVCouldNotCompute();
    CalculationBits = 64;
    DEBUG(DOUT << "*** [dag] SEVERE WARNING: SCEV calculation truncated "
                  "to 64 bits, check for miscompilation! ***\n");
  }

I have never found a test where this actually causes a problem, and we
have tens of thousands of tests.

It is very bad form to create wide arithmetic where none was there before.
This is one of the downfalls of SCEV, but I consider it to be minor
compared to the benefits.

-Dave
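
As a rough, standalone sketch of the width calculation in the snippet
above (not the LLVM code itself): log2_32_ceil below is a hand-written
stand-in for LLVM's Log2_32_Ceil, and calculation_bits with its
clamp_to_64 flag is a hypothetical helper invented for this example. It
assumes W is the width of the original arithmetic and T is the number of
extra bits the multiplication step needs (T = 1 when the product is later
divided by 2).

```cpp
#include <cstdint>

// Stand-in for LLVM's Log2_32_Ceil: ceil(log2(v)), valid for v >= 1.
unsigned log2_32_ceil(uint32_t v) {
    unsigned bits = 0;
    while ((uint64_t(1) << bits) < v)
        ++bits;
    return bits;
}

// Width used for the multiplication step: W + T rounded up to a power
// of two, optionally clamped to 64 as in the local hack quoted above.
// Assumes w + t is small enough that the shift below cannot overflow.
unsigned calculation_bits(unsigned w, unsigned t, bool clamp_to_64) {
    unsigned bits = 1U << log2_32_ceil(w + t);
    if (clamp_to_64 && bits > 64)
        bits = 64;  // may truncate the product; see the warning above
    return bits;
}
```

Under these assumptions calculation_bits(64, 1, false) is 128, since 65
rounds up to the next power of two, and calculation_bits(64, 1, true) is
64. A 32-bit computation with T = 1 needs 64 bits either way, so the clamp
only changes results for arithmetic that is already 64 bits wide.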