Hal Finkel
2013-Nov-21 20:39 UTC
[LLVMdev] Unaligned load/store for callee-saved 128-bit registers
----- Original Message -----> From: "Francois Pichet" <pichet2000 at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Chad Rosier" <mcrosier at codeaurora.org>, "Jakob Stoklund Olesen" <jolesen at apple.com>, "LLVM Developers Mailing > List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, November 21, 2013 2:36:06 PM > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers > > > BTW I managed to get around this problem by flagging all the 128-bit > registers as caller saved only. > > On my system, vector registers are more likely to be used on leaf > functions anyway. > >Sounds good; however, unless I'm wrong, this looks like a general problem that needs to be fixed. -Hal> > > > > On Thu, Nov 21, 2013 at 3:24 PM, Hal Finkel < hfinkel at anl.gov > > wrote: > > > > > ----- Original Message ----- > > From: "Hal Finkel" < hfinkel at anl.gov > > > To: "Francois Pichet" < pichet2000 at gmail.com > > > Cc: "LLVM Developers Mailing List" < llvmdev at cs.uiuc.edu > > > Sent: Monday, November 18, 2013 2:45:53 PM > > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved > > 128-bit registers > > > > ----- Original Message ----- > > > From: "Francois Pichet" < pichet2000 at gmail.com > > > > To: "LLVM Developers Mailing List" < llvmdev at cs.uiuc.edu > > > > Sent: Monday, November 18, 2013 2:26:30 PM > > > Subject: [LLVMdev] Unaligned load/store for callee-saved 128-bit > > > registers > > > > > > > > > > > > On my (out-of-tree) target I have 16 128-bit registers. > > > Unaligned load/store are illegal. (must 16-bytes aligned) > > > > > > > > > > > > 8 of those registers are defined as callee-saved and 8 > > > caller-saved. > > > The default stack size is 4 bytes. > > > The target implements dynamic stack realign to make sure the > > > stack > > > will always be aligned correctly when necessary. > > > > > > > > > Yet I am still getting unaligned load/store when running this > > > test > > > case: http://pastie.org/8490604 > > > > > > > > > The problem is in PEI::calculateCalleeSavedRegisters: > > > > > > > > > > > > // We may not be able to satisfy the desired alignment > > > specification > > > of > > > // the TargetRegisterClass if the stack alignment is smaller. Use > > > the > > > // min. > > > Align = std::min(Align, StackAlign); > > > FrameIdx = MFI->CreateStackObject(RC->getSize(), Align, true); > > > > > > > > > This will create unaligned load/store for a callee-saved 128-bit > > > register on the frame slot because StackAlign is 4. > > > > > > > > > Adding a check for stack realignable or putting all the 128-bit > > > registers as caller-save will fix the problem. > > > > > > if (!TFI->isStackRealignable()) <--- new line > > > Align = std::min(Align, StackAlign); > > > > > > Is this a bug or am I missing something? > > > > > > > This looks like a bug. By default, isStackRealignable() always > > returns true (this default comes from the TargetFrameLowering > > constructor). I wonder, however, is this is not correctly > > implemented in some backends (X86RegisterInfo::canRealignStack, for > > example, is not completely trivial). Nadav, do you know how this > > works? > > [Trying some other relevant people...] > > Chad, Jakob: thoughts? > > -Hal > > > > > > > -Hal > > > > > > > > Thanks, > > > Francois Pichet, Octasic. > > > > > > _______________________________________________ > > > LLVM Developers mailing list > > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > > > > -- > > Hal Finkel > > Assistant Computational Scientist > > Leadership Computing Facility > > Argonne National Laboratory > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory > >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Hal Finkel
2014-Jan-16 05:16 UTC
[LLVMdev] Unaligned load/store for callee-saved 128-bit registers
Ping. Jakob, can you please take a look at this? Francois's suggested fix looks reasonable to me, but as I mentioned below, I wonder if we'd need to better tie together TFI->isStackRealignable() and X86RegisterInfo::canRealignStack (etc.) to avoid breaking things (or at least so that everything will actually work as expected). -Hal ----- Original Message -----> From: "Hal Finkel" <hfinkel at anl.gov> > To: "Francois Pichet" <pichet2000 at gmail.com> > Cc: "Jakob Stoklund Olesen" <jolesen at apple.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, November 21, 2013 2:39:44 PM > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers > > ----- Original Message ----- > > From: "Francois Pichet" <pichet2000 at gmail.com> > > To: "Hal Finkel" <hfinkel at anl.gov> > > Cc: "Chad Rosier" <mcrosier at codeaurora.org>, "Jakob Stoklund > > Olesen" <jolesen at apple.com>, "LLVM Developers Mailing > > List" <llvmdev at cs.uiuc.edu> > > Sent: Thursday, November 21, 2013 2:36:06 PM > > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved > > 128-bit registers > > > > > > BTW I managed to get around this problem by flagging all the > > 128-bit > > registers as caller saved only. > > > > On my system, vector registers are more likely to be used on leaf > > functions anyway. > > > > > > Sounds good; however, unless I'm wrong, this looks like a general > problem that needs to be fixed. > > -Hal > > > > > > > > > > > On Thu, Nov 21, 2013 at 3:24 PM, Hal Finkel < hfinkel at anl.gov > > > wrote: > > > > > > > > > > ----- Original Message ----- > > > From: "Hal Finkel" < hfinkel at anl.gov > > > > To: "Francois Pichet" < pichet2000 at gmail.com > > > > Cc: "LLVM Developers Mailing List" < llvmdev at cs.uiuc.edu > > > > Sent: Monday, November 18, 2013 2:45:53 PM > > > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved > > > 128-bit registers > > > > > > ----- Original Message ----- > > > > From: "Francois Pichet" < pichet2000 at gmail.com > > > > > To: "LLVM Developers Mailing List" < llvmdev at cs.uiuc.edu > > > > > Sent: Monday, November 18, 2013 2:26:30 PM > > > > Subject: [LLVMdev] Unaligned load/store for callee-saved > > > > 128-bit > > > > registers > > > > > > > > > > > > > > > > On my (out-of-tree) target I have 16 128-bit registers. > > > > Unaligned load/store are illegal. (must 16-bytes aligned) > > > > > > > > > > > > > > > > 8 of those registers are defined as callee-saved and 8 > > > > caller-saved. > > > > The default stack size is 4 bytes. > > > > The target implements dynamic stack realign to make sure the > > > > stack > > > > will always be aligned correctly when necessary. > > > > > > > > > > > > Yet I am still getting unaligned load/store when running this > > > > test > > > > case: http://pastie.org/8490604 > > > > > > > > > > > > The problem is in PEI::calculateCalleeSavedRegisters: > > > > > > > > > > > > > > > > // We may not be able to satisfy the desired alignment > > > > specification > > > > of > > > > // the TargetRegisterClass if the stack alignment is smaller. > > > > Use > > > > the > > > > // min. > > > > Align = std::min(Align, StackAlign); > > > > FrameIdx = MFI->CreateStackObject(RC->getSize(), Align, true); > > > > > > > > > > > > This will create unaligned load/store for a callee-saved > > > > 128-bit > > > > register on the frame slot because StackAlign is 4. > > > > > > > > > > > > Adding a check for stack realignable or putting all the 128-bit > > > > registers as caller-save will fix the problem. > > > > > > > > if (!TFI->isStackRealignable()) <--- new line > > > > Align = std::min(Align, StackAlign); > > > > > > > > Is this a bug or am I missing something? > > > > > > > > > > This looks like a bug. By default, isStackRealignable() always > > > returns true (this default comes from the TargetFrameLowering > > > constructor). I wonder, however, is this is not correctly > > > implemented in some backends (X86RegisterInfo::canRealignStack, > > > for > > > example, is not completely trivial). Nadav, do you know how this > > > works? > > > > [Trying some other relevant people...] > > > > Chad, Jakob: thoughts? > > > > -Hal > > > > > > > > > > > > -Hal > > > > > > > > > > > Thanks, > > > > Francois Pichet, Octasic. > > > > > > > > _______________________________________________ > > > > LLVM Developers mailing list > > > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > > > > > > > -- > > > Hal Finkel > > > Assistant Computational Scientist > > > Leadership Computing Facility > > > Argonne National Laboratory > > > _______________________________________________ > > > LLVM Developers mailing list > > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > > > > -- > > Hal Finkel > > Assistant Computational Scientist > > Leadership Computing Facility > > Argonne National Laboratory > > > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Jakob Stoklund Olesen
2014-Jan-16 05:41 UTC
[LLVMdev] Unaligned load/store for callee-saved 128-bit registers
On Jan 15, 2014, at 9:16 PM, Hal Finkel <hfinkel at anl.gov> wrote:> Jakob, can you please take a look at this? Francois's suggested fix looks reasonable to me, but as I mentioned below, I wonder if we'd need to better tie together TFI->isStackRealignable() and X86RegisterInfo::canRealignStack (etc.) to avoid breaking things (or at least so that everything will actually work as expected).This is a really complicated problem, and I don't have all the details paged in at the moment. You need to determine fairly early on if you are going to dynamically realign the stack because it affects whether you'll need to reserve a frame pointer and/or a base pointer. SelectionDAGIsel.cpp has a call to MRI.freezeReservedRegs() after instruction selection is complete. That's when you need to make the call. Note that this is long before you know if the register allocator is going to spill any vectors. X86 and ARM will both fall back to unaligned vector spill instructions if the stack is insufficiently aligned, see their respective storeRegToStackSlot implementations. Other targets may have to conservatively assume that vectors will spill if there are any present in the function. It is also possible to reserve the frame pointer if the stack might need realignment. You can then avoid inserting the actual realignment code in PEI if it turns out it wasn't necessary after all. But you're not getting your frame pointer register back, that ship has sailed. This all gets triply hairy if the function contains dynamic allocas as well, and you may need a base pointer register. I don't think there is an easy target-independent fix for this. Thanks, /jakob> ----- Original Message ----- >> From: "Hal Finkel" <hfinkel at anl.gov> >> To: "Francois Pichet" <pichet2000 at gmail.com> >> Cc: "Jakob Stoklund Olesen" <jolesen at apple.com>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> >> Sent: Thursday, November 21, 2013 2:39:44 PM >> Subject: Re: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers >> >> ----- Original Message ----- >>> From: "Francois Pichet" <pichet2000 at gmail.com> >>> To: "Hal Finkel" <hfinkel at anl.gov> >>> Cc: "Chad Rosier" <mcrosier at codeaurora.org>, "Jakob Stoklund >>> Olesen" <jolesen at apple.com>, "LLVM Developers Mailing >>> List" <llvmdev at cs.uiuc.edu> >>> Sent: Thursday, November 21, 2013 2:36:06 PM >>> Subject: Re: [LLVMdev] Unaligned load/store for callee-saved >>> 128-bit registers >>> >>> >>> BTW I managed to get around this problem by flagging all the >>> 128-bit >>> registers as caller saved only. >>> >>> On my system, vector registers are more likely to be used on leaf >>> functions anyway. >>> >>> >> >> Sounds good; however, unless I'm wrong, this looks like a general >> problem that needs to be fixed. >> >> -Hal >> >>> >>> >>> >>> >>> On Thu, Nov 21, 2013 at 3:24 PM, Hal Finkel < hfinkel at anl.gov > >>> wrote: >>> >>> >>> >>> >>> ----- Original Message ----- >>>> From: "Hal Finkel" < hfinkel at anl.gov > >>>> To: "Francois Pichet" < pichet2000 at gmail.com > >>>> Cc: "LLVM Developers Mailing List" < llvmdev at cs.uiuc.edu > >>>> Sent: Monday, November 18, 2013 2:45:53 PM >>>> Subject: Re: [LLVMdev] Unaligned load/store for callee-saved >>>> 128-bit registers >>>> >>>> ----- Original Message ----- >>>>> From: "Francois Pichet" < pichet2000 at gmail.com > >>>>> To: "LLVM Developers Mailing List" < llvmdev at cs.uiuc.edu > >>>>> Sent: Monday, November 18, 2013 2:26:30 PM >>>>> Subject: [LLVMdev] Unaligned load/store for callee-saved >>>>> 128-bit >>>>> registers >>>>> >>>>> >>>>> >>>>> On my (out-of-tree) target I have 16 128-bit registers. >>>>> Unaligned load/store are illegal. (must 16-bytes aligned) >>>>> >>>>> >>>>> >>>>> 8 of those registers are defined as callee-saved and 8 >>>>> caller-saved. >>>>> The default stack size is 4 bytes. >>>>> The target implements dynamic stack realign to make sure the >>>>> stack >>>>> will always be aligned correctly when necessary. >>>>> >>>>> >>>>> Yet I am still getting unaligned load/store when running this >>>>> test >>>>> case: http://pastie.org/8490604 >>>>> >>>>> >>>>> The problem is in PEI::calculateCalleeSavedRegisters: >>>>> >>>>> >>>>> >>>>> // We may not be able to satisfy the desired alignment >>>>> specification >>>>> of >>>>> // the TargetRegisterClass if the stack alignment is smaller. >>>>> Use >>>>> the >>>>> // min. >>>>> Align = std::min(Align, StackAlign); >>>>> FrameIdx = MFI->CreateStackObject(RC->getSize(), Align, true); >>>>> >>>>> >>>>> This will create unaligned load/store for a callee-saved >>>>> 128-bit >>>>> register on the frame slot because StackAlign is 4. >>>>> >>>>> >>>>> Adding a check for stack realignable or putting all the 128-bit >>>>> registers as caller-save will fix the problem. >>>>> >>>>> if (!TFI->isStackRealignable()) <--- new line >>>>> Align = std::min(Align, StackAlign); >>>>> >>>>> Is this a bug or am I missing something? >>>>> >>>> >>>> This looks like a bug. By default, isStackRealignable() always >>>> returns true (this default comes from the TargetFrameLowering >>>> constructor). I wonder, however, is this is not correctly >>>> implemented in some backends (X86RegisterInfo::canRealignStack, >>>> for >>>> example, is not completely trivial). Nadav, do you know how this >>>> works? >>> >>> [Trying some other relevant people...] >>> >>> Chad, Jakob: thoughts? >>> >>> -Hal >>> >>> >>> >>>> >>>> -Hal >>>> >>>>> >>>>> Thanks, >>>>> Francois Pichet, Octasic. >>>>> >>>>> _______________________________________________ >>>>> LLVM Developers mailing list >>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>> >>>> >>>> -- >>>> Hal Finkel >>>> Assistant Computational Scientist >>>> Leadership Computing Facility >>>> Argonne National Laboratory >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>> >>> >>> -- >>> Hal Finkel >>> Assistant Computational Scientist >>> Leadership Computing Facility >>> Argonne National Laboratory >>> >>> >> >> -- >> Hal Finkel >> Assistant Computational Scientist >> Leadership Computing Facility >> Argonne National Laboratory >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory