Arsenault, Matthew via llvm-dev
2019-Nov-14 08:20 UTC
[llvm-dev] imm COPY generated by PHI elim not propagated
In this case the load imm is foldable into the copy once the copy is converted to a mov. Folding it directly would produce 4 v_mov_b32 instead of the 5 produced currently.

-Matt

On 11/14/19, 07:20, "llvm-dev on behalf of Quentin Colombet via llvm-dev" <llvm-dev-bounces at lists.llvm.org on behalf of llvm-dev at lists.llvm.org> wrote:

> Hi Ryan,
>
> Unless you can fold your immediate directly into an instruction, it is actually not profitable to propagate it. You will end up with a bunch of load imm instead of reusing a register that already holds this value.
>
> The way it works right now is: if holding this value in a register is too expensive, i.e., it triggers a spill, then we rematerialize the immediate instead of holding a register for it.
>
> Cheers,
> -Quentin
>
> > On Nov 13, 2019, at 7:36 AM, Ryan Taylor via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> > I have some code such that:
> >
> > vgpr1 = mov 0
> > branch bb
> > bb:
> > PHI vgpr2 = vgpr1, ….
> > PHI vgpr3 = vgpr1, ….
> > PHI vgpr4 = vgpr1, ….
> > PHI vgpr5 = vgpr1, ….
> >
> > PHI node elimination is generating copies for all these PHIs (and hoisting them) as such:
> >
> > vgpr1 = 0
> > vgpr20 = COPY vgpr1 // old vgpr2
> > vgpr30 = COPY vgpr1 // old vgpr3
> > vgpr40 = COPY vgpr1 // old vgpr4
> > vgpr50 = COPY vgpr1 // old vgpr5
> >
> > I expect the zero to get propagated in a later phase, but it's not. I was looking at adding immediate folding to the register coalescer, but this doesn't really seem like the right place. Any suggestions?
> >
> > I'm sort of surprised that other targets haven't run into this issue.
> >
> > -Ryan
Quentin Colombet via llvm-dev
2019-Nov-14 17:15 UTC
[llvm-dev] imm COPY generated by PHI elim not propagated
That sounds like the folding could be done when you expand the copy in expand pseudo after regalloc.
Ryan Taylor via llvm-dev
2019-Nov-15 19:16 UTC
[llvm-dev] imm COPY generated by PHI elim not propagated
This would require getting the reaching definition which requires live intervals analysis.
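To make the reaching-definition concern concrete, here is a toy sketch of folding at copy-expansion time. It is not LLVM code: `local_reaching_imm` and `expand_copy` are hypothetical helpers, and a backward scan within one block stands in for the reaching-definition query. The scan gives up when the definition is not visible in the block, which is the case that would need a real analysis.

```python
# Toy model, not LLVM: a block is a list of (opcode, dest, source) tuples.

def local_reaching_imm(block, index, reg):
    """Scan backward from block[index] for the nearest definition of `reg`.

    Returns the immediate if that definition is a MOV_IMM in this block.
    Returns None if the nearest definition is some other instruction, or if
    no definition is found in the block (it would come from a predecessor,
    which a block-local scan cannot see).
    """
    for op, dst, src in reversed(block[:index]):
        if dst == reg:
            return src if op == "MOV_IMM" else None
    return None


def expand_copy(block, index):
    """Expand the COPY at block[index] into a mov, folding a known immediate."""
    op, dst, src = block[index]
    assert op == "COPY"
    imm = local_reaching_imm(block, index, src)
    if imm is not None:
        return ("V_MOV_B32", dst, imm)  # mov dst, <immediate>
    return ("V_MOV_B32", dst, src)      # mov dst, src (no fold)


same_block = [("MOV_IMM", "v1", 0), ("COPY", "v20", "v1")]
other_block = [("COPY", "v20", "v1")]  # v1 defined in a predecessor
redefined = [("MOV_IMM", "v1", 0), ("ADD", "v1", "v2"), ("COPY", "v20", "v1")]

folded = expand_copy(same_block, 1)
unfolded = expand_copy(other_block, 0)
blocked = expand_copy(redefined, 2)
```

Only the first case folds. In the second, the scan cannot tell what value v1 holds on entry to the block. In the third, an intervening redefinition means the immediate no longer reaches the copy.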