Hey guys,

I found a performance regression in the X86 backend related to PR10884. In trunk, the frame pointer is always set up when a function uses an AVX register. This is done in case 32-byte spill code is later introduced into the function, which would then require dynamic stack realignment. Needless to say, it's a big hammer. The regression is particularly painful in small-to-medium sized routines that some codes call frequently.

Is this issue already known? Is there a plan to fix this regression? If not, does anyone have a suggestion on the best way to remedy it?

I have attached the IR and C code for a trivial test that exhibits the problem. The IR was produced by clang-421.0.60.

Thanks,
Cameron

-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.ll
Type: application/octet-stream
Size: 1486 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121022/ecda41c8/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.c
Type: text/x-csrc
Size: 136 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20121022/ecda41c8/attachment.c>

> In trunk, the frame pointer is always set up when an AVX register is used in
> a function. This is done in case 32-byte spill code is later introduced into
> the function and hence dynamic stack realignment is needed. Needless to say,
> it's a big hammer. This regression seems particularly painful in
> small-to-medium sized routines that are called frequently in some codes.
>
> Is this issue already known? Is there a plan to fix this regression? If not,
> does anyone have a suggestion on the best way to remedy this issue?

You'd need to change the default stack alignment of the platform to deal with it effectively.

-eric

On Mon, Oct 22, 2012 at 5:49 PM, Eric Christopher <echristo at gmail.com> wrote:

> > In trunk, the frame pointer is always set up when an AVX register is used in
> > a function. This is done in case 32-byte spill code is later introduced into
> > the function and hence dynamic stack realignment is needed. Needless to say,
> > it's a big hammer. This regression seems particularly painful in
> > small-to-medium sized routines that are called frequently in some codes.
> >
> > Is this issue already known? Is there a plan to fix this regression? If not,
> > does anyone have a suggestion on the best way to remedy this issue?
>
> You'd need to change the default stack alignment of the platform to
> deal with it effectively.

Hey Eric,

Thanks for replying so quickly. Would you elaborate on this further?

Changing the default stack alignment of the platform seems costly, since it would require recompiling all of the system and user libraries to adhere to 32-byte stack alignment as well. Depending on an alignment that the ABI does not specify would also limit our compiler's interoperability with other compilers installed on the system. I suppose the stack could be aligned dynamically at main(...) and other visible entry points, but that too seems costly compared to the current M.O.

Maybe I do not fully understand all the issues involved, but shouldn't it be possible to dynamically align the stack only in functions that actually spill AVX registers? That seems reasonable given my limited knowledge. Do you have any intuition about this? It is possible that the prologue/epilogue emitters run before the spilling decisions are made; I am not sure of the ordering here.

Also, and this might be asking a lot, do you have any insight into why this behaviour changed sometime around the LLVM 3.0 release? I have not been able to find much history.

Thanks again,
Cameron
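
For readers following the thread, the reason dynamic realignment pulls in a frame pointer can be sketched with plain address arithmetic. This is only an illustration, assuming the usual 16-byte stack alignment at function entry; `align_down` and `realigned_sp` are hypothetical helper names, not LLVM functions, and the addresses are made up.

```c
#include <stdint.h>

/* Round an address down to a multiple of `align` (a power of two). */
uintptr_t align_down(uintptr_t addr, uintptr_t align) {
    return addr & ~(align - 1);
}

/* Hypothetical model of a realigning prologue: reserve `frame_size` bytes
   below the incoming stack pointer, then force 32-byte alignment so that
   32-byte spill slots can be addressed with aligned accesses. */
uintptr_t realigned_sp(uintptr_t incoming_sp, uintptr_t frame_size) {
    return align_down(incoming_sp - frame_size, 32);
}

/* Number of bytes the prologue actually moved the stack pointer.
   It depends on the run-time value of incoming_sp, so the epilogue
   cannot undo it with a compile-time constant; the original value
   has to be kept somewhere, which is the frame pointer's job. */
uintptr_t adjustment(uintptr_t incoming_sp, uintptr_t frame_size) {
    return incoming_sp - realigned_sp(incoming_sp, frame_size);
}
```

With a 64-byte frame, an incoming stack pointer of 0x7fff0020 (already 32-byte aligned) is moved by 64 bytes, while 0x7fff0030 (only 16-byte aligned) is moved by 80 bytes. Both are legal entry states under a 16-byte alignment guarantee, which is why the adjustment is not known statically.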

Eric Christopher <echristo at gmail.com> writes:

>> Is this issue already known? Is there a plan to fix this regression? If not,
>> does anyone have a suggestion on the best way to remedy this issue?
>
> You'd need to change the default stack alignment of the platform to
> deal with it effectively.

That's not possible since such code will have to interface with externally-compiled libraries which won't have the same alignment assumptions.

-David
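
The interoperability concern can be illustrated with a toy model. It is a sketch under stated assumptions, not a description of any real library: `entry_sp_via_library` and `is_aligned` are hypothetical names, the frame sizes are invented, and each call is modelled as simply moving the stack pointer down by the callee's total frame size.

```c
#include <stdint.h>

/* Nonzero if `sp` is a multiple of `align` (a power of two). */
int is_aligned(uintptr_t sp, uintptr_t align) {
    return (sp & (align - 1)) == 0;
}

/* Toy model: code built for 32-byte stack alignment calls into an
   externally compiled library function, which then calls back into the
   32-byte-aware code. The library only keeps its frame a multiple of 16,
   so the stack pointer seen at the callback's entry is the caller's
   stack pointer minus whatever frame size the library happened to use. */
uintptr_t entry_sp_via_library(uintptr_t caller_sp, uintptr_t library_frame) {
    return caller_sp - library_frame;
}
```

Starting from a 32-byte aligned stack pointer such as 0x7fff0000, a library frame of 48 bytes leaves the callback with 0x7ffeffd0, which is 16-byte aligned but not 32-byte aligned. A function that assumed 32-byte alignment on entry would therefore be wrong whenever it is reached through code compiled with the 16-byte assumption.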