Cameron McInally
2012-Mar-02 16:58 UTC
[LLVMdev] Stack alignment on X86 AVX seems incorrect
On Fri, Mar 2, 2012 at 11:32 AM, Evandro Menezes <emenezes at codeaurora.org> wrote:
...
> Figure 3.3 on page 16 of www.x86-64.org/documentation/abi.pdf is not
> normative. See footnote 7 on the same page. Figure 3.4 on page 21
> confirms that the use of a frame pointer is optional.
>
> So, if one doesn't use ENTER in the prologue and uses RSP to access local
> variables, RBP may be used as a callee-saved GPR.

I am not sure I am completely following. The case that required aligning the frame to 32 bytes is when there are variable-sized objects on the stack (e.g. alloca). In that case, the RBP frame pointer is required to access the spill slots. If I'm not mistaken, calculating the address of spill slots off of RSP would be costly in this case.

Are you suggesting that there is a way to base spill slots off of RSP when the stack size is unknown at compile time?

This does bring up an interesting idea though. If we wanted to punt, it would be possible to check for variable-sized objects on the stack and then only issue unaligned moves for 256-bit spills/reloads. Not ideal for performance, but it would work as a stopgap.

-Cameron
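The stopgap described above could be sketched roughly as follows. This is only an illustration of the decision, not LLVM code: `frame_info`, `spill_kind`, and `choose_spill_kind` are hypothetical names invented for this sketch, and the assumption that a frame without variable-sized objects can always be realigned is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; these are not LLVM's data structures. */
typedef enum { SPILL_ALIGNED, SPILL_UNALIGNED } spill_kind;

typedef struct {
    bool   has_var_sized_objects; /* e.g. the function calls alloca */
    size_t stack_align;           /* alignment the frame already guarantees, in bytes */
} frame_info;

/* Choose the move flavour for a spill slot that wants slot_align bytes. */
spill_kind choose_spill_kind(const frame_info *fi, size_t slot_align)
{
    /* Alignments are expected to be non-zero powers of two. */
    assert(slot_align != 0 && (slot_align & (slot_align - 1)) == 0);

    /* The frame already guarantees enough alignment: aligned moves are safe. */
    if (fi->stack_align >= slot_align)
        return SPILL_ALIGNED;

    /* Stopgap: with variable-sized objects present, skip realignment and
     * fall back to unaligned moves for the over-aligned slot. */
    if (fi->has_var_sized_objects)
        return SPILL_UNALIGNED;

    /* Assumption: otherwise the prologue can realign the frame, so aligned
     * moves remain usable. */
    return SPILL_ALIGNED;
}
```

With a 16-byte-aligned frame, a 32-byte (256-bit) slot would get unaligned moves only when the function also has variable-sized objects; a 16-byte slot is unaffected either way.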
Joerg Sonnenberger
2012-Mar-02 17:16 UTC
[LLVMdev] Stack alignment on X86 AVX seems incorrect
On Fri, Mar 02, 2012 at 11:58:29AM -0500, Cameron McInally wrote:
> On Fri, Mar 2, 2012 at 11:32 AM, Evandro Menezes <emenezes at codeaurora.org>
> wrote:
> ...
> > Figure 3.3 on page 16 of www.x86-64.org/documentation/abi.pdf is not
> > normative. See footnote 7 on the same page. Figure 3.4 on page 21
> > confirms that the use of a frame pointer is optional.
> >
> > So, if one doesn't use ENTER in the prologue and uses RSP to access local
> > variables, RBP may be used as a callee-saved GPR.
>
> I am not sure I am completely following. The case that required
> aligning the frame to 32 bytes is when there are variable-sized objects on
> the stack (e.g. alloca). In that case, the RBP frame pointer is required to
> access the spill slots. If I'm not mistaken, calculating the address of
> spill slots off of RSP would be costly in this case.

No, stack realignment needs to happen if there are auto variables on the stack of types that need a larger alignment than the default. This currently means AVX vectors for x86-64 and SSE/AVX vectors for x86-32 following the original sysv ABI. In that case %rbp/%ebp is used to reference the original arguments on the stack and %rsp/%esp is used to reference the auto variables.

This doesn't work though if dynamic allocas exist, so either stack variables with larger alignment need to be turned into / remain as dynamic allocas OR another register is needed to replace %rsp/%esp in the above.

> This does bring up an interesting idea though. If we wanted to punt, it
> would be possible to check for variable-sized objects on the stack and then
> only issue unaligned moves for 256-bit spills/reloads. Not ideal for
> performance, but it would work as a stopgap.

The problem is worse on x86-32 following the original sysv ABI. In that case both GCC and LLVM currently just create broken code if a function uses both SSE instructions and alloca.

Joerg
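The register roles described above can be modelled with plain integer arithmetic. The sketch below is a toy model under stated assumptions, not generated code or compiler internals: `frame_regs`, `enter_frame`, `dynamic_alloca`, and `local_addr` are hypothetical names, the saved frame pointer push is omitted, and the "extra register" is modelled as a snapshot of the stack pointer taken right after realignment.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the three registers involved (hypothetical, for illustration). */
typedef struct {
    uintptr_t fp; /* models %rbp/%ebp: fixed offset to the incoming arguments */
    uintptr_t sp; /* models %rsp/%esp: moves again on every dynamic alloca */
    uintptr_t bp; /* models the extra register: sp right after realignment */
} frame_regs;

/* Prologue: remember the entry stack pointer, reserve space for the locals,
 * then round the stack pointer down to the requested alignment. */
frame_regs enter_frame(uintptr_t entry_sp, uintptr_t locals_size, uintptr_t align)
{
    /* The alignment must be a non-zero power of two. */
    assert(align != 0 && (align & (align - 1)) == 0);

    frame_regs r;
    r.fp = entry_sp;
    r.sp = (entry_sp - locals_size) & ~(align - 1);
    r.bp = r.sp; /* aligned locals stay addressable from here */
    return r;
}

/* Dynamic alloca: only sp moves, by an amount unknown at compile time.
 * The result is kept 16-byte aligned, as a default stack alignment would be. */
void dynamic_alloca(frame_regs *r, uintptr_t size)
{
    r->sp = (r->sp - size) & ~(uintptr_t)15;
}

/* Address of an over-aligned local at a compile-time offset from bp. */
uintptr_t local_addr(const frame_regs *r, uintptr_t offset)
{
    return r->bp + offset;
}
```

In this model the distance between `fp` and the locals depends on the run-time realignment padding, and the distance between `sp` and the locals depends on the run-time alloca size, so neither gives a constant offset once both features are present. Only `bp` does, which is the role the additional register would play.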
On Mar 2, 2012, at 9:16 AM, Joerg Sonnenberger <joerg at britannica.bec.de> wrote:
> On Fri, Mar 02, 2012 at 11:58:29AM -0500, Cameron McInally wrote:
>> On Fri, Mar 2, 2012 at 11:32 AM, Evandro Menezes <emenezes at codeaurora.org>
>> wrote:
>> ...
>>> Figure 3.3 on page 16 of www.x86-64.org/documentation/abi.pdf is not
>>> normative. See footnote 7 on the same page. Figure 3.4 on page 21
>>> confirms that the use of a frame pointer is optional.
>>>
>>> So, if one doesn't use ENTER in the prologue and uses RSP to access local
>>> variables, RBP may be used as a callee-saved GPR.
>>
>> I am not sure I am completely following. The case that required
>> aligning the frame to 32 bytes is when there are variable-sized objects on
>> the stack (e.g. alloca). In that case, the RBP frame pointer is required to
>> access the spill slots. If I'm not mistaken, calculating the address of
>> spill slots off of RSP would be costly in this case.
>
> No, stack realignment needs to happen if there are auto variables on the
> stack of types that need a larger alignment than the default. This
> currently means AVX vectors for x86-64 and SSE/AVX vectors for x86-32
> following the original sysv ABI. In that case %rbp/%ebp is used to
> reference the original arguments on the stack and %rsp/%esp is used to
> reference the auto variables.
>
> This doesn't work though if dynamic allocas exist, so either stack
> variables with larger alignment need to be turned into / remain as
> dynamic allocas OR another register is needed to replace %rsp/%esp
> in the above.
>

Exactly right.

>> This does bring up an interesting idea though. If we wanted to punt, it
>> would be possible to check for variable-sized objects on the stack and then
>> only issue unaligned moves for 256-bit spills/reloads. Not ideal for
>> performance, but it would work as a stopgap.
>
> The problem is worse on x86-32 following the original sysv ABI. In that
> case both GCC and LLVM currently just create broken code if a function
> uses both SSE instructions and alloca.
>
> Joerg
Evandro Menezes
2012-Mar-02 21:38 UTC
[LLVMdev] Stack alignment on X86 AVX seems incorrect
Cameron,

I was the one not completely following you. I missed the detail about variable-sized variables on the stack.

--
Evandro Menezes Austin, TX emenezes at codeaurora.org
Qualcomm Innovation Center, Inc is a member of the Code Aurora Forum

On 03/02/12 10:58, Cameron McInally wrote:
> On Fri, Mar 2, 2012 at 11:32 AM, Evandro Menezes
> <emenezes at codeaurora.org> wrote:
> ...
> > Figure 3.3 on page 16 of www.x86-64.org/documentation/abi.pdf is not
> > normative. See footnote 7 on the same page. Figure 3.4 on page 21
> > confirms that the use of a frame pointer is optional.
> >
> > So, if one doesn't use ENTER in the prologue and uses RSP to access local
> > variables, RBP may be used as a callee-saved GPR.
>
> I am not sure I am completely following. The case that required
> aligning the frame to 32 bytes is when there are variable-sized objects
> on the stack (e.g. alloca). In that case, the RBP frame pointer is
> required to access the spill slots. If I'm not mistaken, calculating the
> address of spill slots off of RSP would be costly in this case.
>
> Are you suggesting that there is a way to base spill slots off of RSP
> when the stack size is unknown at compile time?
>
> This does bring up an interesting idea though. If we wanted to punt, it
> would be possible to check for variable-sized objects on the stack and
> then only issue unaligned moves for 256-bit spills/reloads. Not ideal for
> performance, but it would work as a stopgap.
>
> -Cameron