Edward Lee
2008-Oct-08 16:59 UTC
[LLVMdev] Lost instcombine opportunity: "or"s of "icmp"s (commutability)
instcombine can handle certain orders of "icmp"s that are "or"ed together: x != 5 OR x > 10 OR x == 8 becomes.. x != 5 OR x == 8 becomes.. x != 5 However, a different ordering prevents the simplification: x == 8 OR x > 10 OR x != 5 becomes.. %or.eq8.gt10 OR x != 5 and that can't be simplified because we now have an "or" OR "icmp". What would I need to implement to restore the commutative property? Perhaps a first stab would be to take (A|B)|C create two binaryOp A|C and B|C and recursively call visitOr on each of them to see if they simplify. Using the example above.. Before: %or.eq8.gt10 = .. ; [uses=1] %res = or %or.eq8.gt10, %ne5 ; original instruction After: %or.eq8.gt10 = .. ; [uses=0] %or.eq8.ne5 = %ne5 ; instcombine recursively simplified this [uses=1 or 0 see next] %res = or %or.eq8.ne5, %gt10 ; even better: %res = or %ne5, %gt10 The recursive call of A|C would also let C see further "or"s, so if A is actually composed of (a1|a2), it could potentially try to simplify a1|C and a2|C. I'm not entirely sure right now, but potentially there could be issues of creating too many additional "or" instructions. E.g., %or.eq.8.gt10 originally had more than 1 use. Am I on the right track (or does LLVM already support this in another optimization pass?) Ed
Edward Lee
2008-Oct-08 22:58 UTC
[LLVMdev] [PATCH] Lost instcombine opportunity: "or"s of "icmp"s (commutability)
Here's an initial stab, but I'm not too happy about the temporarily adding new instructions then removing it because returning it will have it added back in to replace other uses. I also added a couple test cases pass with the new InstructionCombining changes (the old code only passes one of the added tests). Also, this change exposes some simplification for test/Transforms/InstCombine/2006-05-06-Infloop.ll. The original code truncates two i32 values to i8 and eventually ORs them together. The patch has it optimized to OR the two values first then truncate a single value. %tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=1] %tmp212 = trunc i32 %iftmp.36.0 to i8 ; <i8> [#uses=2] %tmp265 = trunc i32 %tmp261 to i8 ; <i8> [#uses=1] %tmp266 = or i8 %tmp212, %tmp264not ; <i8> [#uses=2] %tmp268 = or i8 %tmp266, %tmp265 ; <i8> [#uses=1] %tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=2] %a.c45 = or i32 %iftmp.36.0, %tmp261 ; <i32> [#uses=1] trunc i32 %a.c45 to i8 ; <i8>:0 [#uses=1] %tmp268 = or i8 %0, %tmp264not ; <i8> [#uses=1] Ed On Wed, Oct 8, 2008 at 11:59 AM, Edward Lee <eslee3 at uiuc.edu> wrote:> instcombine can handle certain orders of "icmp"s that are "or"ed together: > x != 5 OR x > 10 OR x == 8 becomes.. > x != 5 OR x == 8 becomes.. > x != 5 > > However, a different ordering prevents the simplification: > x == 8 OR x > 10 OR x != 5 becomes.. > %or.eq8.gt10 OR x != 5 > and that can't be simplified because we now have an "or" OR "icmp". > > What would I need to implement to restore the commutative property? > > Perhaps a first stab would be to take (A|B)|C create two binaryOp A|C > and B|C and recursively call visitOr on each of them to see if they > simplify. Using the example above.. > > Before: > %or.eq8.gt10 = .. ; [uses=1] > %res = or %or.eq8.gt10, %ne5 ; original instruction > > After: > %or.eq8.gt10 = .. ; [uses=0] > %or.eq8.ne5 = %ne5 ; instcombine recursively simplified this [uses=1 > or 0 see next] > %res = or %or.eq8.ne5, %gt10 ; even better: %res = or %ne5, %gt10 > > The recursive call of A|C would also let C see further "or"s, so if A > is actually composed of (a1|a2), it could potentially try to simplify > a1|C and a2|C. > > I'm not entirely sure right now, but potentially there could be issues > of creating too many additional "or" instructions. E.g., %or.eq.8.gt10 > originally had more than 1 use. > > Am I on the right track (or does LLVM already support this in another > optimization pass?) > > Ed >-- Ed -------------- next part -------------- A non-text attachment was scrubbed... Name: or.commute.patch Type: text/x-patch Size: 3090 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20081008/80069501/attachment.bin>
Edward Lee
2008-Oct-09 02:54 UTC
[LLVMdev] [PATCH] Lost instcombine opportunity: "or"s of "icmp"s (commutability)
Oh. Seems like you can get something similar by running -instcombine twice but stick in a -reassociate in between the two. ;) Hmm.. I wonder how much faster/smaller would instcombine get without any associative/commutative code compared to running it twice with -reassociate. Ed On Wed, Oct 8, 2008 at 5:58 PM, Edward Lee <eslee3 at uiuc.edu> wrote:> Here's an initial stab, but I'm not too happy about the temporarily > adding new instructions then removing it because returning it will > have it added back in to replace other uses. I also added a couple > test cases pass with the new InstructionCombining changes (the old > code only passes one of the added tests). > > Also, this change exposes some simplification for > test/Transforms/InstCombine/2006-05-06-Infloop.ll. The original code > truncates two i32 values to i8 and eventually ORs them together. The > patch has it optimized to OR the two values first then truncate a > single value. > > %tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=1] > %tmp212 = trunc i32 %iftmp.36.0 to i8 ; <i8> [#uses=2] > %tmp265 = trunc i32 %tmp261 to i8 ; <i8> [#uses=1] > %tmp266 = or i8 %tmp212, %tmp264not ; <i8> [#uses=2] > %tmp268 = or i8 %tmp266, %tmp265 ; <i8> [#uses=1] > > %tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=2] > %a.c45 = or i32 %iftmp.36.0, %tmp261 ; <i32> [#uses=1] > trunc i32 %a.c45 to i8 ; <i8>:0 [#uses=1] > %tmp268 = or i8 %0, %tmp264not ; <i8> [#uses=1] > > Ed > On Wed, Oct 8, 2008 at 11:59 AM, Edward Lee <eslee3 at uiuc.edu> wrote: >> instcombine can handle certain orders of "icmp"s that are "or"ed together: >> x != 5 OR x > 10 OR x == 8 becomes.. >> x != 5 OR x == 8 becomes.. >> x != 5 >> >> However, a different ordering prevents the simplification: >> x == 8 OR x > 10 OR x != 5 becomes.. >> %or.eq8.gt10 OR x != 5 >> and that can't be simplified because we now have an "or" OR "icmp". >> >> What would I need to implement to restore the commutative property? >> >> Perhaps a first stab would be to take (A|B)|C create two binaryOp A|C >> and B|C and recursively call visitOr on each of them to see if they >> simplify. Using the example above.. >> >> Before: >> %or.eq8.gt10 = .. ; [uses=1] >> %res = or %or.eq8.gt10, %ne5 ; original instruction >> >> After: >> %or.eq8.gt10 = .. ; [uses=0] >> %or.eq8.ne5 = %ne5 ; instcombine recursively simplified this [uses=1 >> or 0 see next] >> %res = or %or.eq8.ne5, %gt10 ; even better: %res = or %ne5, %gt10 >> >> The recursive call of A|C would also let C see further "or"s, so if A >> is actually composed of (a1|a2), it could potentially try to simplify >> a1|C and a2|C. >> >> I'm not entirely sure right now, but potentially there could be issues >> of creating too many additional "or" instructions. E.g., %or.eq.8.gt10 >> originally had more than 1 use. >> >> Am I on the right track (or does LLVM already support this in another >> optimization pass?) >> >> Ed >> > > > > -- > Ed >-- Ed
Nick Lewycky
2008-Oct-09 06:39 UTC
[LLVMdev] [PATCH] Lost instcombine opportunity: "or"s of "icmp"s (commutability)
Edward Lee wrote:> Here's an initial stab, but I'm not too happy about the temporarily > adding new instructions then removing it because returning it will > have it added back in to replace other uses. I also added a couple > test cases pass with the new InstructionCombining changes (the old > code only passes one of the added tests).Hi Edward. I was wondering whether your optimization could be implemented using the AssociativeOpt framework already in InstCombine? It has a built-in mini-reassociate just for this occasion. Nick> Also, this change exposes some simplification for > test/Transforms/InstCombine/2006-05-06-Infloop.ll. The original code > truncates two i32 values to i8 and eventually ORs them together. The > patch has it optimized to OR the two values first then truncate a > single value. > > %tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=1] > %tmp212 = trunc i32 %iftmp.36.0 to i8 ; <i8> [#uses=2] > %tmp265 = trunc i32 %tmp261 to i8 ; <i8> [#uses=1] > %tmp266 = or i8 %tmp212, %tmp264not ; <i8> [#uses=2] > %tmp268 = or i8 %tmp266, %tmp265 ; <i8> [#uses=1] > > %tmp264not = xor i8 %tmp264, -1 ; <i8> [#uses=2] > %a.c45 = or i32 %iftmp.36.0, %tmp261 ; <i32> [#uses=1] > trunc i32 %a.c45 to i8 ; <i8>:0 [#uses=1] > %tmp268 = or i8 %0, %tmp264not ; <i8> [#uses=1] > > Ed > On Wed, Oct 8, 2008 at 11:59 AM, Edward Lee <eslee3 at uiuc.edu> wrote: >> instcombine can handle certain orders of "icmp"s that are "or"ed together: >> x != 5 OR x > 10 OR x == 8 becomes.. >> x != 5 OR x == 8 becomes.. >> x != 5 >> >> However, a different ordering prevents the simplification: >> x == 8 OR x > 10 OR x != 5 becomes.. >> %or.eq8.gt10 OR x != 5 >> and that can't be simplified because we now have an "or" OR "icmp". >> >> What would I need to implement to restore the commutative property? >> >> Perhaps a first stab would be to take (A|B)|C create two binaryOp A|C >> and B|C and recursively call visitOr on each of them to see if they >> simplify. Using the example above.. >> >> Before: >> %or.eq8.gt10 = .. ; [uses=1] >> %res = or %or.eq8.gt10, %ne5 ; original instruction >> >> After: >> %or.eq8.gt10 = .. ; [uses=0] >> %or.eq8.ne5 = %ne5 ; instcombine recursively simplified this [uses=1 >> or 0 see next] >> %res = or %or.eq8.ne5, %gt10 ; even better: %res = or %ne5, %gt10 >> >> The recursive call of A|C would also let C see further "or"s, so if A >> is actually composed of (a1|a2), it could potentially try to simplify >> a1|C and a2|C. >> >> I'm not entirely sure right now, but potentially there could be issues >> of creating too many additional "or" instructions. E.g., %or.eq.8.gt10 >> originally had more than 1 use. >> >> Am I on the right track (or does LLVM already support this in another >> optimization pass?) >> >> Ed >> > > > > > ------------------------------------------------------------------------ > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev