vivek pandya via llvm-dev
2016-Jun-25 17:33 UTC
[llvm-dev] Tail call optimization is getting affected due to local function related optimization with IPRA
Hello LLVM Community,

To improve Interprocedural Register Allocation (IPRA), we are trying to force caller-saved registers for local functions (functions that have local linkage). To achieve this, I have modified TargetFrameLowering::determineCalleeSaves() to return early for any function that satisfies

    if (F->hasLocalLinkage() && !F->hasAddressTaken())

To reflect the fact that such a local function has no callee-saved registers, I am also changing RegUsageInfoCollector.cpp so that it does not mark registers as callee-saved in the RegMask on account of the calling convention (CC), with the following change:

    if (!F->hasLocalLinkage() || F->hasAddressTaken()) {
      const uint32_t *CallPreservedMask =
          TRI->getCallPreservedMask(MF, MF.getFunction()->getCallingConv());
      // Set callee saved register as preserved.
      for (unsigned i = 0; i < RegMaskSize; ++i)
        RegMask[i] = RegMask[i] | CallPreservedMask[i];
    }

For more details, please see the following link:
https://groups.google.com/d/msg/llvm-dev/XRzGhJ9wtZg/bYFMzppXEwAJ

Now consider the following bug, caused by forcing caller-saved registers for a local function when IPRA is enabled:

    void makewt(int nw, int *ip, double *w) {
      ...
      bitrv2(nw, ip, w);
    }

Here bitrv2 is a local function, so when IPRA is enabled its set of callee-saved registers is empty. The following is the set of clobbered registers for that function, according to the regmask collected by the RegUsageInfoCollector pass:

    Function Name : bitrv2
    Clobbered Registers:
    AH AL AX BH BL BP BPL BX CH CL CX DI DIL EAX EBP EBX ECX EDI EFLAGS ESI ESP RAX
    RBP RBX RCX RDI RSI RSP SI SIL SP SPL R8 R9 R10 R11 R12 R13 R14 R15 R8B R9B R10B
    R11B R12B R13B R14B R15B R8D R9D R10D R11D R12D R13D R14D R15D R8W R9W R10W R11W
    R12W R13W R14W R15W

However, the caller of bitrv2, makewt, still has callee-saved registers as per its CC, and this code results in a segmentation fault when compiled with -O1. makewt has the value of *ip in the R14 register, and makewt saves and restores that register itself. Because of tail call optimization, though, the code below is generated, and bitrv2 does not preserve R14. So when execution returns to main (the caller of makewt), the value of *ip is gone from R14 (which should not happen), and when main calls makewt again, the value of *ip (R14) is wrong, which results in a segmentation fault.

Assembly code of makewt:

    _makewt:
        ...
        popq %rbx
        popq %r12
        popq %r13
        popq %r14
        popq %r15
        popq %rbp
        jmp _bitrv2 ## TAILCALL

There is one more case of failure due to the local function optimization. I am still analyzing it (sorry for taking more time, but I am not very good at assembly).

I need some hints on how to solve this. If you see a problem with my analysis, please let me know. I can also send the generated .s file and the source .c file if you want.

Sincerely,
Vivek
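To make the regmask change above easier to follow, here is a minimal, self-contained model of it. It is only a sketch and not the code in RegUsageInfoCollector.cpp: the four-register numbering, the FuncInfo struct, and the helper names are all hypothetical, and it assumes that a set bit in the mask means "preserved across the call" and that makewt is not a local function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy register numbering (hypothetical); a real target has many more.
enum Reg : unsigned { RAX, RBX, R14, R15, NumRegs };

// One bit per register; a set bit means "preserved across the call".
using RegMask = std::vector<uint32_t>;

// Hypothetical stand-in for the properties the patch queries on a function.
struct FuncInfo {
  bool LocalLinkage;
  bool AddressTaken;
  std::vector<unsigned> ModifiedRegs; // registers written by the body
};

bool isPreserved(const RegMask &Mask, unsigned R) {
  assert(R < NumRegs && "register out of range");
  return (Mask[R / 32] >> (R % 32)) & 1u;
}

RegMask computeRegMask(const FuncInfo &F, const RegMask &CallPreservedMask) {
  // Start from "everything preserved", then clear what the body writes.
  RegMask Mask((NumRegs + 31) / 32, 0xFFFFFFFFu);
  for (unsigned R : F.ModifiedRegs)
    Mask[R / 32] &= ~(1u << (R % 32));
  // Only non-local or address-taken functions get the CC's callee-saved set
  // OR'd back in; this mirrors the condition in the patch above.
  if (!F.LocalLinkage || F.AddressTaken)
    for (std::size_t I = 0; I < Mask.size(); ++I)
      Mask[I] |= CallPreservedMask[I];
  return Mask;
}
```

With a callee-saved set of {RBX, R14, R15}, a local function that writes R14 ends up with R14 marked as clobbered, while a non-local function that writes the same register still reports it as preserved. That mismatch is what the tail call from makewt to bitrv2 exposes.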
Eli Friedman via llvm-dev
2016-Jun-25 19:49 UTC
[llvm-dev] Tail call optimization is getting affected due to local function related optimization with IPRA
On Sat, Jun 25, 2016 at 10:33 AM, vivek pandya via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> [...]
> Assembly code of makewt:
> _makewt:
> ...
> popq %rbx
> popq %r12
> popq %r13
> popq %r14
> popq %r15
> popq %rbp
> jmp _bitrv2 ## TAILCALL

If you're tail calling bitrv2 from makewt, bitrv2 has to preserve the value of all the registers which the caller of makewt expects it to preserve, i.e. the callee-save registers according to makewt's calling convention. Otherwise, you're just going to get nonsensical results, like you've discovered.

There aren't really that many options here... you can suppress tail call optimization in certain cases, you can suppress your optimization for functions which are tail-called, or you can redirect tail calls to a different entry point from normal calls. Which of those options is best probably depends on the specific code you're trying to optimize. (You aren't allowed to suppress tail call optimization for calls marked musttail.)

More generally, it's not clear that shoving all the spill code from the callee into the callers is always beneficial; in some cases, you could end up with a substantial code-size increase without reducing the number of instructions executed at runtime. So you probably need to analyze the callers somehow anyway.

-Eli
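The constraint described above can be read as a subset check on preserved-register masks: a tail call is only safe if the callee preserves at least everything the tail-calling function must preserve for its own caller. The following is a rough sketch of that check together with the first two options, not LLVM code; the TailCallCandidate struct, the Decision enum, and the function names are hypothetical, and a set bit is assumed to mean "preserved".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using RegMask = std::vector<uint32_t>; // set bit = register preserved

// True if every register preserved by Required is also preserved by Provided.
bool preservesAtLeast(const RegMask &Provided, const RegMask &Required) {
  assert(Provided.size() == Required.size() && "masks must match in size");
  for (std::size_t I = 0; I < Required.size(); ++I)
    if (Required[I] & ~Provided[I])
      return false;
  return true;
}

// Hypothetical description of one potential tail call site.
struct TailCallCandidate {
  RegMask CallerPreserved; // what the caller's own caller relies on
  RegMask CalleePreserved; // what the callee actually preserves
  bool IsMustTail;
};

enum class Decision { EmitTailCall, EmitNormalCall, KeepCalleeSaves };

Decision decide(const TailCallCandidate &C) {
  if (preservesAtLeast(C.CalleePreserved, C.CallerPreserved))
    return Decision::EmitTailCall;
  // A musttail call cannot be demoted to a normal call, so in this sketch the
  // only remaining choice is to leave the callee's callee-saved registers alone.
  return C.IsMustTail ? Decision::KeepCalleeSaves : Decision::EmitNormalCall;
}
```

In the makewt/bitrv2 case, the callee mask would be empty while the caller mask holds the CC's callee-saved set, so this sketch would fall back to a normal call. The third option, a separate entry point for tail calls, is not modeled here.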
vivek pandya via llvm-dev
2016-Jun-25 19:56 UTC
[llvm-dev] Tail call optimization is getting affected due to local function related optimization with IPRA
On Sat, Jun 25, 2016 at 11:03 PM, vivek pandya <vivekvpandya at gmail.com> wrote:
> [...]
> Assembly code of makewt:
> _makewt:
> ...
> popq %rbx
> popq %r12
> popq %r13
> popq %r14
> popq %r15
> popq %rbp
> jmp _bitrv2 ## TAILCALL

A very naive solution to this problem that occurs to me is to convert the above code to the following:

    _makewt:
        ...
        jmp _bitrv2 ## TAILCALL
        popq %rbx
        popq %r12
        popq %r13
        popq %r14
        popq %r15
        popq %rbp

so that when _bitrv2 returns, the caller will overwrite the callee-saved registers (as per the CC of that function) with the correct values. I wanted to try it out, but I am not able to find the right place in the code to do that.

-Vivek
vivek pandya via llvm-dev
2016-Jun-26 16:35 UTC
[llvm-dev] Tail call optimization is getting affected due to local function related optimization with IPRA
According to http://llvm.org/docs/CodeGenerator.html#tail-call-section, adding a new CC for the purpose of the local function optimization seems like a good idea, because tail call optimization only takes place when both caller and callee have the fastcc, GHC, or HiPE calling convention.

-Vivek

On Sun, Jun 26, 2016 at 1:26 AM, vivek pandya <vivekvpandya at gmail.com> wrote:
> A very naive solution to this problem come to me is to convert above code
> to following:
> [...]