Hi all,

It looks like vector spills don't use aligned moves even though the stack is aligned. This seems like an optimization opportunity.

The attached replacement of fibonacci.cpp generates x86 code like this:

    03A70010  push   ebp
    03A70011  mov    ebp,esp
    03A70013  and    esp,0FFFFFFF0h
    03A70019  sub    esp,1A0h
    ...
    03A7006C  movups xmmword ptr [esp+180h],xmm7
    ...
    03A70229  mulps  xmm1,xmmword ptr [esp+180h]
    ...
    03A70682  movups xmm0,xmmword ptr [esp+180h]

Note how the stores and loads use unaligned moves while they could use aligned moves. It's also interesting that the multiply does correctly assume the stack to be 16-byte aligned.

Is there something I'm doing wrong (again), or is this already known?

Thanks a lot,

Nicolas Capens

Name: fibonacci.cpp
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080714/2854bfae/attachment.ksh>

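The alignment claim in the message above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original post: it replays the four prologue instructions from the listing on a made-up incoming esp value and computes the address of the [esp+180h] slot. The function names and the sample esp values are assumptions chosen for illustration; only the three constants come from the listing.

```cpp
#include <cassert>
#include <cstdint>

// Constants taken from the listing in the message above.
constexpr std::uint32_t kAlignMask  = 0xFFFFFFF0u; // and esp,0FFFFFFF0h
constexpr std::uint32_t kFrameSize  = 0x1A0u;      // sub esp,1A0h
constexpr std::uint32_t kSlotOffset = 0x180u;      // [esp+180h]

// Replays the prologue for a given incoming esp and returns the address
// of the spill slot. (Hypothetical helper, for illustration only.)
constexpr std::uint32_t spillSlotAddress(std::uint32_t incomingEsp) {
  return (((incomingEsp - 4u)   // push ebp
           & kAlignMask)        // and esp,0FFFFFFF0h
          - kFrameSize)         // sub esp,1A0h
         + kSlotOffset;         // esp+180h
}

constexpr bool isAligned16(std::uint32_t address) {
  return (address & 0xFu) == 0u;
}

// Checks a range of incoming esp values, aligned or not.
inline void checkSpillSlotAlignment() {
  for (std::uint32_t esp = 0x0012FF00u; esp < 0x0012FF40u; ++esp) {
    assert(isAligned16(spillSlotAddress(esp)));
  }
}
```

Because both 1A0h and 180h are multiples of 16, the slot stays 16-byte aligned for any incoming esp once the `and` has run. That is the property an aligned move needs, and it is also what mulps relies on when it takes the slot as a memory operand.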
This is on Windows / Cygwin? I think the dynamic stack pointer re-alignment doesn't happen until post-register allocation. Assuming there aren't other instructions between the prologue and the first movups that mess up esp (there shouldn't), this is indeed a bug. Please file a bug and attach a bc file. Thanks.

Evan

On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote:

> It looks like vector spills don’t use aligned moves even though the
> stack is aligned. This seems like an optimization opportunity.

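One way to read Evan's explanation is sketched below. This is not LLVM code: the enum, the function, and the parameter names are hypothetical, and the 4-byte figure used in the check is an assumption about the default stack alignment on 32-bit Windows. The idea is that whoever picks the spill instruction can only use an aligned move if, at that moment, the stack is already known to be 16-byte aligned and the slot offset is a multiple of 16.

```cpp
#include <cassert>

// Hypothetical names for illustration only; this is not LLVM's API.
enum class SpillMove { Aligned, Unaligned };

// Alignment, in bytes, that an aligned 128-bit move (movaps) requires.
constexpr unsigned kXmmSlotAlign = 16;

// knownStackAlign: stack alignment (bytes) that can be relied on at the
//                  point where the spill instruction is chosen.
// slotOffset:      offset of the spill slot from the stack pointer.
constexpr SpillMove chooseSpillMove(unsigned knownStackAlign,
                                    unsigned slotOffset) {
  return (knownStackAlign != 0 &&
          knownStackAlign % kXmmSlotAlign == 0 &&
          slotOffset % kXmmSlotAlign == 0)
             ? SpillMove::Aligned
             : SpillMove::Unaligned;
}

// If only a 4-byte alignment is known (assumed default), the unaligned
// form is chosen even for the 180h slot from the listing.
inline void checkSpillMoveChoice() {
  assert(chooseSpillMove(4, 0x180) == SpillMove::Unaligned);
  assert(chooseSpillMove(16, 0x180) == SpillMove::Aligned);
}
```

Under this reading, spill code selected before the realignment is taken into account would get the unaligned form, which matches the movups instructions in the listing, while a later choice could use the aligned form.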
On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote:

> It looks like vector spills don’t use aligned moves even though the
> stack is aligned. This seems like an optimization opportunity.

What target is this? Linux doesn't have a 16-byte aligned stack.

-Chris

On Jul 14, 2008, at 11:37 AM PDT, Chris Lattner wrote:

> On Jul 14, 2008, at 7:43 AM, Nicolas Capens wrote:
>
>> It looks like vector spills don’t use aligned moves even though the
>> stack is aligned. This seems like an optimization opportunity.
>
> What target is this? Linux doesn't have a 16-byte aligned stack.
>
> -Chris

It does now :) Anton (I think) implemented dynamic stack alignment for functions with XMM variables. Note the "and esp" in the fragment below.

>> The attached replacement of fibonacci.cpp generates x86 code like
>> this:
>>
>> 03A70010  push   ebp
>> 03A70011  mov    ebp,esp
>> 03A70013  and    esp,0FFFFFFF0h
>> 03A70019  sub    esp,1A0h
>> ...
>> 03A7006C  movups xmmword ptr [esp+180h],xmm7
>> ...
>> 03A70229  mulps  xmm1,xmmword ptr [esp+180h]
>> ...
>> 03A70682  movups xmm0,xmmword ptr [esp+180h]

This is on Windows / MSVC++ 2005. I'll file a bug.

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Evan Cheng
Sent: Monday, 14 July, 2008 20:35
To: LLVM Developers Mailing List
Cc: anton at korobeynikov.info
Subject: Re: [LLVMdev] Spilled variables using unaligned moves

This is on Windows / Cygwin?

Hi Evan,

Could you maybe point me to the source files where this issue might originate? I'd like to learn more about LLVM's innards, but so far I've just scratched the surface and I don't know where each phase (instruction selection, register allocation, stack layout, etc.) happens.

If I understand correctly, this issue might be fixed by moving stack pointer alignment before register allocation? Is this something that might be reasonably straightforward, or are there complicated dependencies involved?

Thanks again,

Nicolas