Dan Gohman
2008-May-07  00:56 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
Hello Matthijs, Separating mechanism from policy is a good thing for the LoopUnroll pass. Instead of moving the policy to a subclass though, I think it'd be better to move the mechanism, the unrollLoop function, out to be a standalone utility function, with the LoopInfo object passed in explicitly. FoldBlockIntoPredecessor would also be good to make into a standalone utility function, since it isn't specific to unrolling. This should still make it easy to write new unrolling passes with custom heuristics, but it would also be more flexible for passes to do unrolling in combination with other transformations. What do you think? Dan On May 6, 2008, at 3:11 AM, Matthijs Kooijman wrote:> Hi, > > please find an updated patch attached that incorporates the > (trivial) changes > introduced by r50696 and which makes this patch apply cleanly again. > > Gr. > > Matthijs > <split-unroll.diff>_______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Devang Patel
2008-May-07  03:30 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
Dan, On May 6, 2008, at 5:56 PM, Dan Gohman wrote:> This should still make it easy to write new unrolling passes with > custom heuristics, but it would also be more flexible for passes to > do unrolling in combination with other transformations.Do you mean unrolling during other transformation or before/after other transformation ? Do you have specific example in mind ? Thanks, - Devang
Daniel Berlin
2008-May-07  06:06 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
On Tue, May 6, 2008 at 11:30 PM, Devang Patel <dpatel at apple.com> wrote:> Dan, > > > On May 6, 2008, at 5:56 PM, Dan Gohman wrote: > > > This should still make it easy to write new unrolling passes with > > custom heuristics, but it would also be more flexible for passes to > > do unrolling in combination with other transformations. > > Do you mean unrolling during other transformation or before/after > other transformation ? Do you have specific example in mind ? > Thanks,Dunno what he intends, but doing unrolling during other transformations is useful for things like SLP style vectorization, etc.
Matthijs Kooijman
2008-May-07  08:08 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
Hi Dan,> This should still make it easy to write new unrolling passes with > custom heuristics, but it would also be more flexible for passes to > do unrolling in combination with other transformations.Agreed. To shed some light on my specific requirements, I am working with an architecture that has only limited support for loops and control flow. This means that loop unrolling is not a matter of performance, but any loop that can't be mapped on the architecture must be unrolled to be able to work. So, I don't really need the added flexibility, though I agree that it is a good idea nonetheless. Splitting off the second part of unrollLoop (the actual unrolling, after deciding to unroll the loop) shouldn't be a problem. I am a bit concerned about the first part (before the decision), which finds out the trip count and trip multiple. Since both policy and mechanism need these values, that code must be shared somehow as well. I see two options for that: 1. Put two methods in the utility class, each of which return one of those values. 2. Add two methods to the Loop class. Currently, there is a getTripCount() method that returns the trip count as a Value*. Perhaps we could add a getTripCountConst() (or getTripCountInt() ?) and getTripCountMultiple() to Loop? It seems to me the second option is nicer, though it shouldn't make too much of a difference. Furthermore, where to put this utility class? lib/Transforms/UnrollLoop.cpp seems reasonable? Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080507/e886c3b4/attachment.sig>
Dan Gohman
2008-May-07  20:08 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
On May 7, 2008, at 1:08 AM, Matthijs Kooijman wrote:> Hi Dan, > >> This should still make it easy to write new unrolling passes with >> custom heuristics, but it would also be more flexible for passes to >> do unrolling in combination with other transformations. > Agreed. > > To shed some light on my specific requirements, I am working with an > architecture that has only limited support for loops and control > flow. This > means that loop unrolling is not a matter of performance, but any > loop that > can't be mapped on the architecture must be unrolled to be able to > work. So, > I don't really need the added flexibility, though I agree that it is > a good > idea nonetheless.Ok.> > > > Splitting off the second part of unrollLoop (the actual unrolling, > after > deciding to unroll the loop) shouldn't be a problem. I am a bit > concerned > about the first part (before the decision), which finds out the trip > count and > trip multiple. Since both policy and mechanism need these values, > that code > must be shared somehow as well. I see two options for that: > 1. Put two methods in the utility class, each of which return one of > those > values. > 2. Add two methods to the Loop class. Currently, there is a > getTripCount() > method that returns the trip count as a Value*. Perhaps we could > add a > getTripCountConst() (or getTripCountInt() ?) and > getTripCountMultiple() to > Loop?> > > It seems to me the second option is nicer, though it shouldn't make > too much > of a difference.The second option sounds nicer to me too. Is the name getSmallConstantTripCount too paranoid?> > > Furthermore, where to put this utility class? lib/Transforms/ > UnrollLoop.cpp > seems reasonable?Transform utilities can go under lib/Transforms/Utils. :-) Dan
Dan Gohman
2008-May-07  21:59 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
On May 6, 2008, at 8:30 PM, Devang Patel wrote:> Dan, > > On May 6, 2008, at 5:56 PM, Dan Gohman wrote: > >> This should still make it easy to write new unrolling passes with >> custom heuristics, but it would also be more flexible for passes to >> do unrolling in combination with other transformations. > > Do you mean unrolling during other transformation or before/after > other transformation ? Do you have specific example in mind ?Another example is the combined unroll and prefetch technique described in the AMD 5.5 Dan
Matthijs Kooijman
2008-May-09  10:47 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
Hi All,
the attached patch performs the splitting in the proposed manner.
before applying the patch, please execute
	svn cp lib/Transforms/Scalar/LoopUnroll.cpp lib/Transforms/Utils/UnrollLoop.cpp
to make the patch apply and preserve proper history.
Transforms/Utils/UnrollLoop.cpp contains the unrollLoop function, which is now
used by the LoopUnroll pass. I've also moved the RemapInstruction and
FoldBlockIntoPredecessor support functions there.
Additionally, I've also moved the loop unrolling statistics into the Utils
file. Is
that a good idea?
I've added two new methods to Loop (or LoopBase<> really):
getSmallConstantTripCount() and getSmallConstantTripMultiple(), which return
the trip count and multiple as a normal unsigned value.
There are a number of remaining issues, though. Firstly, I observe some code
duplication. Every pass that performs loop unrolling, should depend on the
LoopSimplify, LCSSA and LoopInfo passes. Also, the UnrollLoop code always
preservers the latter two passes. This means a large part of the
getAnalysisUsage for every unroll pass will be copy-pasted. This could be
solved by adding an extra "defaultAnalysisUsage()" function to
Utils/UnrollLoop.cpp or something similar, but I'm not sure that's very
pretty.
Any suggestions?
Additionally, after loop unrolling, the following check is made:
  // Update the loop information for this loop.
  // If we completely unrolled the loop, remove it from the parent.
  if (L->getNumBackEdges() == 0)
    LPM.deleteLoopFromQueue(L);
This check probably needs to happen in every unroll pass that use loopUnroll.
We could move this check into the loopUnroll function, but that requires
passing in the LoopPassManager (LPM) to loopUnroll as well. Is that wanted?
Overall, I've really only moved code, so behaviour should be unchanged. The
only functional change I think I made, was that the old code checked to see if
the terminator instruction of the latch block was a conditional branch first,
before looking at the trip count and the treshold. Since that check moved
together with unrollLoop, it is done a bit later. However, AFAICS, none of the
code (getTripCount, code size estimation) actually depends on this check, so it
shouldn't change anything in the end.
Lastly, the (original comments) say "This pass will multi-block loops only
if they
contain no non-unrolled subloops". I can't really find what this is
supposed to
mean, in particular I can't find any explicit check for subloops anywhere.
Anyone know what this means?
I've also tried to minimize the includes required in both cpp files, seems
like
there were quite some unused ones present.
The new code seems to work on an adhoc test case, but there seems to be a
problem with one of the standard testcases. I'll solve that later today. For
now, please review the patch :-)
Gr.
Matthijs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: unroll-split.diff
Type: text/x-diff
Size: 31752 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20080509/2312c080/attachment.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20080509/2312c080/attachment.sig>
Matthijs Kooijman
2008-May-09  12:33 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
Hi All,> The new code seems to work on an adhoc test case, but there seems to be a > problem with one of the standard testcases. I'll solve that later today. For > now, please review the patch :-)new patch is attached. There was a small logic error, which is now fixed by making getSmallConstantTripMultiple return 1 when it doesn't have any useful result (instead of 0). Makes more sense, since the trip count is always a multiple of 1 :-) This only leaves one failing test case (test/CodeGen/X86/vec_set-5.ll), but I don't think it's related to this patch. Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: unroll-split.diff Type: text/x-diff Size: 31753 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080509/7c154c8c/attachment.diff> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080509/7c154c8c/attachment.sig>
Dan Gohman
2008-May-09  21:04 UTC
[LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
Hello Matthijs, On May 9, 2008, at 3:47 AM, Matthijs Kooijman wrote:> Hi All, > > the attached patch performs the splitting in the proposed manner. > before applying the patch, please execute > svn cp lib/Transforms/Scalar/LoopUnroll.cpp lib/Transforms/Utils/ > UnrollLoop.cpp > to make the patch apply and preserve proper history. > > Transforms/Utils/UnrollLoop.cpp contains the unrollLoop function, > which is now > used by the LoopUnroll pass. I've also moved the RemapInstruction and > FoldBlockIntoPredecessor support functions there. > > Additionally, I've also moved the loop unrolling statistics into the > Utils file. Is > that a good idea? > > I've added two new methods to Loop (or LoopBase<> really): > getSmallConstantTripCount() and getSmallConstantTripMultiple(), > which return > the trip count and multiple as a normal unsigned value. > > There are a number of remaining issues, though. Firstly, I observe > some code > duplication. Every pass that performs loop unrolling, should depend > on the > LoopSimplify, LCSSA and LoopInfo passes. Also, the UnrollLoop code > always > preservers the latter two passes. This means a large part of the > getAnalysisUsage for every unroll pass will be copy-pasted. This > could be > solved by adding an extra "defaultAnalysisUsage()" function to > Utils/UnrollLoop.cpp or something similar, but I'm not sure that's > very pretty. > Any suggestions?I'm not sure that's very pretty either :-}. My sense is that it's better to just duplicate those few lines. The unrollLoop function verifies isLCSSAForm with an assert, and LoopSimplify and LoopInfo are required by pretty much every loop pass. For the addPreserved lines, I guess unrollLoop should just have documented the passes it preserves. An assert at the end that checks that if the loop wasn't completely unrolled that it's still isLCSSAForm would be a good addition in any case.> > > Additionally, after loop unrolling, the following check is made: > // Update the loop information for this loop. > // If we completely unrolled the loop, remove it from the parent. > if (L->getNumBackEdges() == 0) > LPM.deleteLoopFromQueue(L); > > This check probably needs to happen in every unroll pass that use > loopUnroll. > We could move this check into the loopUnroll function, but that > requires > passing in the LoopPassManager (LPM) to loopUnroll as well. Is that > wanted?I don't think it's desirable to require a LoopPassManager object, but I think it would be fine to have an optional LoopPassManager parameter which can be null for this purpose. If a LoopPassManager object is provided and the loop is completely unrolled, the loop is removed from the LoopPassManager.> > > Overall, I've really only moved code, so behaviour should be > unchanged. The > only functional change I think I made, was that the old code checked > to see if > the terminator instruction of the latch block was a conditional > branch first, > before looking at the trip count and the treshold. Since that check > moved > together with unrollLoop, it is done a bit later. However, AFAICS, > none of the > code (getTripCount, code size estimation) actually depends on this > check, so it > shouldn't change anything in the end.Ok.> > > Lastly, the (original comments) say "This pass will multi-block > loops only if they > contain no non-unrolled subloops". I can't really find what this is > supposed to > mean, in particular I can't find any explicit check for subloops > anywhere. > Anyone know what this means?I don't.> > I've also tried to minimize the includes required in both cpp files, > seems like > there were quite some unused ones present.Yep, it's not uncommon for unused includes to accumulate as the code evolves. Thanks for cleaning that up.> > > The new code seems to work on an adhoc test case, but there seems to > be a > problem with one of the standard testcases. I'll solve that later > today. For > now, please review the patch :-)+ /// getSmallConstantTripCount - Returns the trip count of this loop as a + /// normal unsigned value, if possible. Returns 0 if the trip count is unknown + /// of not constant. Will also return 0 if the trip count is very large + /// (> 2^32) This should say (>= 2^32). + inline unsigned getSmallConstantTripCount() const { + Value* TripCount = this->getTripCount(); + if (TripCount) { + if (ConstantInt *TripCountC = dyn_cast<ConstantInt>(TripCount)) { + // Guard against huge trip counts. This also guards against assertions in + // APInt from the use of getZExtValue, below. This comment is a little stale now. Just say it guards against huge trip counts. + if (TripCountC->getValue().getActiveBits() <= 32) { + return (unsigned)TripCountC->getZExtValue(); + } + } + } + return 0; + } + /// getSmallConstantTripMultiple - Returns the largest constant divisor of the + /// trip count of this loop as a normal unsigned value, if possible. This + /// means that the actual trip count is always a multiple of the returned + /// value (don't forget it could very well be zero as well!). I know you corrected this elsewhere, but this comment still says the multiple could be zero. + /// Returns 1 if the trip count is unknown or not guaranteed to be the + /// multiple of a constant (which is also the case if the trip count is simply + /// constant, use getSmallConstantTripCount for that case), Will also return 1 + /// if the trip count is very large (> 2^32). I know you're just refactoring existing code, but as a standalone utility getSmallConstantTripMultiple should really handle the case where the trip count is constant. Eventually, it should be taught a few more operators than just Mul too, such as Shl and And, but you don't need to worry about that now. And again >= instead of >. --- include/llvm/Transforms/Utils/UnrollLoop.h (revision 0) +++ include/llvm/Transforms/Utils/UnrollLoop.h (revision 0) This file misses the comments that all LLVM headers have. We may someday add more loop utilities, in which case we may want to rename this header and add other stuff to it, but we can worry about that when the need arrises. This is fine for now. +bool unrollLoop(Loop *L, unsigned Count, LoopInfo* LI); Please capitalize this to UnrollLoop. LLVM is a little inconsistent about capitalization in some ways, but standalone transforming utility functions are pretty consistently capitalized. Thanks! Dan
Reasonably Related Threads
- [LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
- [LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
- [LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
- [LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy
- [LLVMdev] [PATCH] Split LoopUnroll pass into mechanism and policy