Hi all, I have created a patch (up for review at: http://reviews.llvm.org/D17386) that does Loop Fusion implementation. Approach: Legality: Currently it can fuse two adjacent loops whose iteration spaces are same and are at same depth. Dependence legality: Currently, dependence legality cannot be checked across loops. Hence the loops are cloned along a versioned path, unconditionally fused along that path and then the dependence legality is checked on the fused loop keeping the instructions from original loops in context. Fusion is illegal if there is a backward dependence between memory accesses whose source was in first loop and sink was in second loop. Currently, LoopAccessAnalysis is used to check dependence legality. A basic diagram below tries to explain the approach taken to test dependence legality on two adjacent loops (L1 and L2). L1PH (PH: Preheader) | L1 | CB (L1Exit/L2PH: ConnectingBlock (CB) ) | L2 | L2Exit is versioned as: BooleanBB /\ L1PH L1PH.clone | | L1 L1.clone | | CB CB.clone | | L2 L2.clone \ / L2Exit And fused as: BooleanBB /\ L1PH FusedPH | | L1 L1Blocks | | \ CB L2Blocks | | | |/ L2 | \ / CommonExit Profitability: Yet to be added. Further, based on legality and profitability success, the fused loop is either retained or removed. If runtime checks are necessary, both original and fused loops are retained; otherwise the original loops are removed. Currently, I have scheduled the fusion pass after distribution pass. Such a schedule negates the effect of the other pass, but given that the distribution (and fusion) pass is experimental and off by default, I felt it was okay to schedule that way till a global profitability is implemented. Please share your feedback about the design and implementation. Thank you -- Good time... Vikram TV CompilerTree Technologies Mysore, Karnataka, INDIA -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160218/fdde60e2/attachment.html>
Hi Vikram,> On Feb 18, 2016, at 9:21 AM, Vikram TV via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi all, > > I have created a patch (up for review at: http://reviews.llvm.org/D17386 <http://reviews.llvm.org/D17386>) that does Loop Fusion implementation. > > Approach: > Legality: Currently it can fuse two adjacent loops whose iteration spaces are same and are at same depth. > > Dependence legality: Currently, dependence legality cannot be checked across loops. Hence the loops are cloned along a versioned path, unconditionally fused along that path and then the dependence legality is checked on the fused loop keeping the instructions from original loops in context. Fusion is illegal if there is a backward dependence between memory accesses whose source was in first loop and sink was in second loop. > Currently, LoopAccessAnalysis is used to check dependence legality.Thanks for writing up the design here. I think we have a pretty strong policy against creating temporary instructions and here you actually create an entire loop just to check legality. It would probably be a better design to add the capability of adding two LAI objects together. This would effectively simulate the fusion on the analysis side so you could query the legality from that. Specifically, you could check if you have backward dependences between instructions in L2 to instructions in L1 which would be illegal. As a side effect you’d also get the total set of memchecks which you could filter to only include checks where the participating pointers come from different loops. (This is quite similar to LoopDistribution.) Also I don’t think it should be too hard to teach LVer to be able to version two consecutive loops (or arbitrary CFG?). Let me know what you think, Adam> > A basic diagram below tries to explain the approach taken to test dependence legality on two adjacent loops (L1 and L2). > > L1PH (PH: Preheader) > | > L1 > | > CB (L1Exit/L2PH: ConnectingBlock (CB) ) > | > L2 > | > L2Exit > > is versioned as: > > BooleanBB > /\ > L1PH L1PH.clone > | | > L1 L1.clone > | | > CB CB.clone > | | > L2 L2.clone > \ / > L2Exit > > And fused as: > > BooleanBB > /\ > L1PH FusedPH > | | > L1 L1Blocks > | | \ > CB L2Blocks | > | | |/ > L2 | > \ / > CommonExit > > Profitability: Yet to be added. > > Further, based on legality and profitability success, the fused loop is either retained or removed. If runtime checks are necessary, both original and fused loops are retained; otherwise the original loops are removed. > > Currently, I have scheduled the fusion pass after distribution pass. Such a schedule negates the effect of the other pass, but given that the distribution (and fusion) pass is experimental and off by default, I felt it was okay to schedule that way till a global profitability is implemented. > > Please share your feedback about the design and implementation. > > Thank you > -- > > Good time... > Vikram TV > CompilerTree Technologies > Mysore, Karnataka, INDIA > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160218/f6511aa7/attachment.html>
Hi, Thanks for the reply. Few thoughts inlined. On Fri, Feb 19, 2016 at 8:00 AM, Adam Nemet <anemet at apple.com> wrote:> Hi Vikram, > > On Feb 18, 2016, at 9:21 AM, Vikram TV via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > Hi all, > > I have created a patch (up for review at: http://reviews.llvm.org/D17386) > that does Loop Fusion implementation. > > Approach: > Legality: Currently it can fuse two adjacent loops whose iteration spaces > are same and are at same depth. > > Dependence legality: Currently, dependence legality cannot be checked > across loops. Hence the loops are cloned along a versioned path, > unconditionally fused along that path and then the dependence legality is > checked on the fused loop keeping the instructions from original loops in > context. Fusion is illegal if there is a backward dependence between memory > accesses whose source was in first loop and sink was in second loop. > Currently, LoopAccessAnalysis is used to check dependence legality. > > > Thanks for writing up the design here. > > I think we have a pretty strong policy against creating temporary > instructions and here you actually create an entire loop just to check > legality. >I didn't understand the consequences here. A subsequent DCE pass or explicit removal in this case is taking care of the temporaries. Any pointers in this regard would be helpful.> > It would probably be a better design to add the capability of adding two > LAI objects together. This would effectively simulate the fusion on the > analysis side so you could query the legality from that. >I am not sure how the underlying analysis like SCEV would behave in this case. As per my understanding, it queries for a particular loop while we have populated accesses from two different loops. But assuming that it works, we would lose the ability to try/test using DependenceAnalysis in future. Currently, it is very easy to replace LoopAccessAnalysis with DependenceAnalysis.> > Specifically, you could check if you have backward dependences between > instructions in L2 to instructions in L1 which would be illegal. > > As a side effect you’d also get the total set of memchecks which you could > filter to only include checks where the participating pointers come from > different loops. (This is quite similar to LoopDistribution.) >I am happy to add a routine in a subsequent patch that filter the checks.> > Also I don’t think it should be too hard to teach LVer to be able to > version two consecutive loops (or arbitrary CFG?). >I think yes. Instead of Loop Versioning deciding to version, code can be factored out so that it versions unconditionally "also" as requested by the pass that uses it.> > Let me know what you think, > Adam > > > A basic diagram below tries to explain the approach taken to test > dependence legality on two adjacent loops (L1 and L2). > > L1PH (PH: Preheader) > | > L1 > | > CB (L1Exit/L2PH: ConnectingBlock (CB) ) > | > L2 > | > L2Exit > > is versioned as: > > BooleanBB > /\ > L1PH L1PH.clone > | | > L1 L1.clone > | | > CB CB.clone > | | > L2 L2.clone > \ / > L2Exit > > And fused as: > > BooleanBB > /\ > L1PH FusedPH > | | > L1 L1Blocks > | | \ > CB L2Blocks | > | | |/ > L2 | > \ / > CommonExit > > Profitability: Yet to be added. > > Further, based on legality and profitability success, the fused loop is > either retained or removed. If runtime checks are necessary, both original > and fused loops are retained; otherwise the original loops are removed. > > Currently, I have scheduled the fusion pass after distribution pass. Such > a schedule negates the effect of the other pass, but given that the > distribution (and fusion) pass is experimental and off by default, I felt > it was okay to schedule that way till a global profitability is implemented. > > Please share your feedback about the design and implementation. > > Thank you > -- > > Good time... > Vikram TV > CompilerTree Technologies > Mysore, Karnataka, INDIA > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > >-- Good time... Vikram TV CompilerTree Technologies Mysore, Karnataka, INDIA -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160219/60d2ee4d/attachment.html>