Hi all,

I'm currently struggling with a few optimization passes that change stuff I
don't want to be changed. However, for the most part those passes
(InstructionCombining and SimplifyCFG currently) do stuff that I do want, so
disabling them altogether doesn't help me much.

The problem arises because the architecture I'm compiling for is quite
non-standard. In particular, it has the ability to execute a lot of
instructions in parallel, but at the same time can't execute everything you
throw at it.

My problem with SimplifyCFG is the following: Whenever the if and else branch
start with the same instruction, it gets hoisted up into the predecessor
block. For my architecture, instructions in different blocks can't be run in
parallel, so this optimization makes code either very inefficient or not
compile at all.

InstructionCombining has this habit of removing unneeded bits from constants.
For example, if I do i & 63, where i is a loop counter that is always even,
this gets replaced by i & 62. Which gives, of course, the same results when
interpreted, but our backend cannot just use any constant as an & mask (in
particular, it can only use a limited number of them). I'd very much prefer to
preserve the original value from the source here (I also assume that this
optimization is in place to help further optimizations, because I can't really
see any use of this change on regular architectures...).

I've been thinking a bit on how to achieve this, and I see a few options:

 * Use a local patch to simply disable the parts of the passes we don't want,
   optionally protected by some check to only disable it when it's unwanted.
   This would be a very effective approach, though also very much unwanted.
   Local patches to the LLVM source are a pain to maintain.
 * Use some kind of subclassed pass. Since, AFAICS, simply subclassing an
   existing pass doesn't really work due to the class ID stuff, this requires
   making a superclass to do most of the work and a subclass to decide when to
   do it in LLVM, so we can add another subclass (similar to the Inliner
   pass). Alternatively, the functionality of instruction combining could be
   split off into a utility class, though that would prevent using overriding
   methods to disable some functionality. This approach could be useful, but
   I can't really see how it would work out yet.
 * Add options to the current passes. I could add an option to the current
   passes to make them do what I want (either using an option to
   createXXXPass() and the constructor, or perhaps using a set_XXX_option()
   method or something). This might work for SimplifyCFG, since that option
   could be made a bit more generic, such as "Don't move instructions between
   blocks" (leaving SimplifyCFG free to merge blocks whenever appropriate).

   For InstructionCombining this is harder, since our requirements are not as
   easily captured in an elegant option, I'm afraid.
 * Mark instructions / values as immutable. We could write a pass that marks
   the values we want preserved as immutable and other passes should leave
   those values alone. This requires quite some modification to LLVM (probably
   even the IR) and all optimization passes. Though I think it's actually
   quite an elegant solution, it's probably hard to express everything we need
   in it (also, if we mark some instruction or value as immutable, it's hard
   to prevent a pass from making a copy of the instruction (perhaps
   indirectly) and simply making the immutable instruction unused).
 * Use some kind of TargetInfo/TargetData struct to control certain
   optimizations. I'm not really sure how this is used now, but I could
   imagine that there is some interface for optimization passes to find out
   which optimizations are worthwhile and which are not (something like a
   bool isBetter(Value* old, Value* new) as a very simple example). Is there
   already something like this?

None of these options seem too attractive to me, what do others think? Is
there some other option I'm missing here?

Gr.

Matthijs
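To make the mask example from the post concrete, here is a small standalone C++ check. It is not LLVM code, and the function names are made up for illustration. It only shows that the masks 63 and 62 give identical results as long as the low bit of i is known to be zero, and that they stop agreeing once odd values are allowed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if (i & a) == (i & b) for every even i in
// [0, limit). Even values have bit 0 clear, so bit 0 of the mask is
// irrelevant for them.
bool masksAgreeOnEvenValues(uint32_t a, uint32_t b, uint32_t limit) {
  assert(limit > 0 && "need at least one value to check");
  for (uint32_t i = 0; i < limit; i += 2) {
    if ((i & a) != (i & b))
      return false;
  }
  return true;
}

// Hypothetical helper: the same comparison over every i in [0, limit),
// odd values included.
bool masksAgreeOnAllValues(uint32_t a, uint32_t b, uint32_t limit) {
  assert(limit > 0 && "need at least one value to check");
  for (uint32_t i = 0; i < limit; ++i) {
    if ((i & a) != (i & b))
      return false;
  }
  return true;
}
```

Calling masksAgreeOnEvenValues(63, 62, 1024) returns true, while masksAgreeOnAllValues(63, 62, 1024) returns false because i = 1 already distinguishes the two masks. This is the sense in which the rewritten constant is "the same when interpreted": the transformation relies on knowing that bit 0 of i is zero.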
Matthijs,

On May 26, 2008, at 4:23 AM, Matthijs Kooijman wrote:

> Hi all,
>
> I'm currently struggling with a few optimization passes that change stuff I
> don't want to be changed. However, for the most part those passes
> (InstructionCombining and SimplifyCFG currently) do stuff that I do want, so
> disabling them altogether doesn't help me much.
>
> The problem arises because the architecture I'm compiling for is quite
> non-standard. In particular, it has the ability to execute a lot of
> instructions in parallel, but at the same time can't execute everything you
> throw at it.
>
> My problem with SimplifyCFG is the following: Whenever the if and else branch
> start with the same instruction, it gets hoisted up into the predecessor
> block. For my architecture, instructions in different blocks can't be run in
> parallel, so this optimization makes code either very inefficient or not
> compile at all.
>
> InstructionCombining has this habit of removing unneeded bits from constants.
> For example, if I do i & 63, where i is a loop counter that is always even,
> this gets replaced by i & 62. Which gives, of course, the same results when
> interpreted, but our backend cannot just use any constant as an & mask (in
> particular, it can only use a limited number of them). I'd very much prefer to
> preserve the original value from the source here (I also assume that this
> optimization is in place to help further optimizations, because I can't really
> see any use of this change on regular architectures...).
>
> I've been thinking a bit on how to achieve this, and I see a few options:
>  * Use a local patch to simply disable the parts of the passes we don't want,
>    optionally protected by some check to only disable it when it's unwanted.
>    This would be a very effective approach, though also very much unwanted.
>    Local patches to the LLVM source are a pain to maintain.

Yes, but it works!

>  * Use some kind of subclassed pass. Since, AFAICS, simply subclassing an
>    existing pass doesn't really work due to the class ID stuff, this requires
>    making a superclass to do most of the work and a subclass to decide when to
>    do it in LLVM, so we can add another subclass (similar to the Inliner
>    pass). Alternatively, the functionality of instruction combining could be
>    split off into a utility class, though that would prevent using overriding
>    methods to disable some functionality. This approach could be useful, but
>    I can't really see how it would work out yet.
>  * Add options to the current passes. I could add an option to the current
>    passes to make them do what I want (either using an option to
>    createXXXPass() and the constructor, or perhaps using a set_XXX_option()
>    method or something). This might work for SimplifyCFG, since that option
>    could be made a bit more generic, such as "Don't move instructions between
>    blocks" (leaving SimplifyCFG free to merge blocks whenever appropriate).
>
>    For InstructionCombining this is harder, since our requirements are not as
>    easily captured in an elegant option, I'm afraid.

In general, I'd like to avoid additional options if possible.

>  * Mark instructions / values as immutable. We could write a pass that marks
>    the values we want preserved as immutable and other passes should leave
>    those values alone. This requires quite some modification to LLVM (probably
>    even the IR) and all optimization passes. Though I think it's actually
>    quite an elegant solution, it's probably hard to express everything we need
>    in it (also, if we mark some instruction or value as immutable, it's hard
>    to prevent a pass from making a copy of the instruction (perhaps
>    indirectly) and simply making the immutable instruction unused).

Usually "volatile" is one hammer used in such situations to instruct
optimizers to stay away. But it is not elegant and smells like a hack.

>  * Use some kind of TargetInfo/TargetData struct to control certain
>    optimizations. I'm not really sure how this is used now, but I could
>    imagine that there is some interface for optimization passes to find out
>    which optimizations are worthwhile and which are not (something like a
>    bool isBetter(Value* old, Value* new) as a very simple example). Is there
>    already something like this?

Instruction Combiner uses TargetData. However, AFAIK, there is no
general-purpose interface available to select optimizations based on the
target.

> None of these options seem too attractive to me, what do others think? Is
> there some other option I'm missing here?

Would it be possible to write a code-gen-level pass that sinks the
instructions to maximize parallel instructions in a block for your target?

- Devang
On May 26, 2008, at 4:23 AM, Matthijs Kooijman wrote:

> I'm currently struggling with a few optimization passes that change stuff I
> don't want to be changed.

Hehe ok.

> However, for the most part those passes (InstructionCombining
> and SimplifyCFG currently) do stuff that I do want, so disabling them
> altogether doesn't help me much.

Ok.

> The problem arises because the architecture I'm compiling for is quite
> non-standard. In particular, it has the ability to execute a lot of
> instructions in parallel, but at the same time can't execute everything you
> throw at it.

Ok, that is odd :)

> My problem with SimplifyCFG is the following: Whenever the if and else branch
> start with the same instruction, it gets hoisted up into the predecessor
> block. For my architecture, instructions in different blocks can't be run in
> parallel, so this optimization makes code either very inefficient or not
> compile at all.

There are two different issues here. Passes like instcombine and
simplifycfg [which is really "basic block combine" :) ] do two things:

1. They make changes that are clear wins, e.g. deleting unconditional
   branches and noop instrs.
2. They change code into more canonical form.

Merging repeated instructions is an important canonicalization because it
can unlock other optimizations. The fact that your target doesn't like code
in this form is not a good reason for simplifycfg to stop doing it. :)

> InstructionCombining has this habit of removing unneeded bits from constants.
> For example, if I do i & 63, where i is a loop counter that is always even,
> this gets replaced by i & 62. Which gives, of course, the same results when
> interpreted, but our backend cannot just use any constant as an & mask (in
> particular, it can only use a limited number of them).

Sure, this is another example of canonicalization. Are you using the LLVM
code generator? It has support for handling this specifically. ARM and
Alpha in particular have special instructions that only work with very
specific "and" masks. If you write a pattern/instruction that matches
(and myreg, 255) for example, this will match a dag node for
"(and myreg, 16)" if the code generator knows that the other bits are
already zero.

> I'd very much prefer to
> preserve the original value from the source here (I also assume that this
> optimization is in place to help further optimizations, because I can't really
> see any use of this change on regular architectures...).

This is folly. If the user wrote the code in the "optimized" form that
instcombine transforms it into, your code generator should still produce
the optimized instructions. You're trading one missed optimization for
another one.

> I've been thinking a bit on how to achieve this, and I see a few options:
> None of these options seem too attractive to me, what do others think? Is
> there some other option I'm missing here?

I really don't like any of these options. The best ways to go are:

1) Teach your code generator how to do these optimizations, reversing the
   cases that you care about.
2) If #1 isn't feasible, write a canonicalization prepass (like codegen
   prepare) that transforms code from the "canonical optimizer form" into a
   happy form for your target.

-Chris
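The mask matching described above, and the "reverse the canonicalization" idea in option 2, both come down to one bit test: two "and" masks are interchangeable when they differ only in bits already known to be zero. The following is a minimal standalone C++17 sketch of that test. It does not use any LLVM API; masksEquivalent, legalizeMask, the knownZero parameter, and the list of supported masks are all hypothetical names chosen for illustration.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: two AND masks are interchangeable when every bit in
// which they differ is known to be zero in the value being masked.
// knownZero has a 1 in each bit position that is known to be zero.
constexpr bool masksEquivalent(uint64_t a, uint64_t b, uint64_t knownZero) {
  return ((a ^ b) & ~knownZero) == 0;
}

// The example from the reply: a pattern for (and x, 255) can stand in for
// (and x, 16) when every bit of x other than bit 4 is known to be zero.
static_assert(masksEquivalent(255, 16, ~uint64_t{16}),
              "255 and 16 differ only in known-zero bits");

// Hypothetical helper: return the first target-supported mask that is
// equivalent to `mask` given the known-zero bits, or nothing if the
// supported list contains no such mask.
std::optional<uint64_t> legalizeMask(uint64_t mask, uint64_t knownZero,
                                     const std::vector<uint64_t> &supported) {
  for (uint64_t candidate : supported) {
    if (masksEquivalent(mask, candidate, knownZero))
      return candidate;
  }
  return std::nullopt;
}
```

With a supported list of {15, 63, 255}, legalizeMask(62, 1, ...) yields 63 when bit 0 is known to be zero, which recovers the mask from the original source. With no known-zero bits it yields nothing, since 62 is then not equivalent to any supported mask. A real prepass or instruction selector would have to obtain the known-zero bits from the compiler's own analysis; that part is outside this sketch.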