Yuan Lin via llvm-dev
2018-Jul-03 18:06 UTC
[llvm-dev] Question about canonicalizing cmp+select
Hi, Sanjay/all, I noticed in rL331486 that some compare-select optimizations are disabled in favor of providing canonicalized cmp+select to the backend. I am currently working on a private backend target, and the target has a small code size limit. With this change, some of the apps went over the codesize limit. As an example, C code: b = (a > -1) ? 4 : 5; ll code: Before rL331486: %0 = lshr i16 %a.0.a.0., 15 %1 = or i16 %0, 4 After rL331486: %cmp = icmp sgt i16 %a.0.a.0., -1 %cond = select i1 %cmp, i16 4, i16 5 With the various encoding restrictions of my particular target, the cmp/select generated slightly larger code size. However, because the apps were very close to the code size limit, this slight change pushed them over the limit. If I still prefer to have this optimization performed, then is my best strategy moving forward to implement this optimization as a peephole opt in my backend? Thanks, --Yuan -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180703/02716f3d/attachment.html>
Roman Lebedev via llvm-dev
2018-Jul-03 18:13 UTC
[llvm-dev] Question about canonicalizing cmp+select
On Tue, Jul 3, 2018 at 9:06 PM, Yuan Lin via llvm-dev <llvm-dev at lists.llvm.org> wrote:> Hi, Sanjay/all, > > I noticed in rL331486 that some compare-select optimizations are disabled > in favor of providing canonicalized cmp+select to the backend. > > I am currently working on a private backend target, and the target has a > small code size limit. With this change, some of the apps went over the > codesize limit. As an example, > > C code: > b = (a > -1) ? 4 : 5; > > ll code: > Before rL331486: > %0 = lshr i16 %a.0.a.0., 15 > %1 = or i16 %0, 4 > > After rL331486: > %cmp = icmp sgt i16 %a.0.a.0., -1 > %cond = select i1 %cmp, i16 4, i16 5 > > With the various encoding restrictions of my particular target, the > cmp/select generated slightly larger code size. However, because the apps > were very close to the code size limit, this slight change pushed them over > the limit. > > If I still prefer to have this optimization performed, then is my best > strategy moving forward to implement this optimization as a peephole opt in > my backend?I personally think it should probably be a DAGCombine transform, driven by a Target Transform Info hook (e.g. like hasAndNot()).> Thanks, > --YuanRoman.> _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Sanjay Patel via llvm-dev
2018-Jul-03 18:22 UTC
[llvm-dev] Question about canonicalizing cmp+select
Do you run DAGCombiner? And are you overriding TLI.convertSelectOfConstantsToMath(VT) for your target? For the stated example (true val and false val constants in the select differ by 1), that should already be converted. On Tue, Jul 3, 2018 at 12:13 PM, Roman Lebedev <lebedev.ri at gmail.com> wrote:> On Tue, Jul 3, 2018 at 9:06 PM, Yuan Lin via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > Hi, Sanjay/all, > > > > I noticed in rL331486 that some compare-select optimizations are > disabled > > in favor of providing canonicalized cmp+select to the backend. > > > > I am currently working on a private backend target, and the target has > a > > small code size limit. With this change, some of the apps went over the > > codesize limit. As an example, > > > > C code: > > b = (a > -1) ? 4 : 5; > > > > ll code: > > Before rL331486: > > %0 = lshr i16 %a.0.a.0., 15 > > %1 = or i16 %0, 4 > > > > After rL331486: > > %cmp = icmp sgt i16 %a.0.a.0., -1 > > %cond = select i1 %cmp, i16 4, i16 5 > > > > With the various encoding restrictions of my particular target, the > > cmp/select generated slightly larger code size. However, because the > apps > > were very close to the code size limit, this slight change pushed them > over > > the limit. > > > > If I still prefer to have this optimization performed, then is my best > > strategy moving forward to implement this optimization as a peephole opt > in > > my backend? > I personally think it should probably be a DAGCombine transform, > driven by a Target Transform Info hook (e.g. like hasAndNot()). > > > Thanks, > > --Yuan > Roman. > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180703/91644269/attachment.html>
Sanjay Patel via llvm-dev
2018-Jul-03 22:39 UTC
[llvm-dev] Question about canonicalizing cmp+select
[adding back llvm-dev and cc'ing Craig] I think you are asking if we are missing a fold (or your target is missing enabling another hook) to transform the sext+add into shift+or? I think the answer is 'yes'. We probably should add that fold. This seems like a similar case as the recent: https://reviews.llvm.org/D48466 Note that on x86, the sext+add becomes zext+sub: t20: i8 = setcc t3, Constant:i16<-1>, setgt:ch t24: i16 = zero_extend t20 t17: i16 = sub Constant:i16<5>, t24 Would that transform help your target? On Tue, Jul 3, 2018 at 3:55 PM, Yuan Lin <yualin at google.com> wrote:> Hi, Roman and Sanjay, > > Thank you for your reply! We currently do run DAGCombiner, but didn't > implement this specific transformation. I just tried turning on > convertSelectOfCosntantsToMath() in our ISelLowering, but that doesn't > quite work because it generated a sign_extend op from i1 to i16, which our > backend currently doesn't support. > > Does the DAGCombiner already has this transformation implemented? > > Thanks, > --Yuan > > On Tue, Jul 3, 2018 at 11:22 AM Sanjay Patel <spatel at rotateright.com> > wrote: > >> Do you run DAGCombiner? And are you overriding TLI. >> convertSelectOfConstantsToMath(VT) for your target? >> >> For the stated example (true val and false val constants in the select >> differ by 1), that should already be converted. >> >> >> On Tue, Jul 3, 2018 at 12:13 PM, Roman Lebedev <lebedev.ri at gmail.com> >> wrote: >> >>> On Tue, Jul 3, 2018 at 9:06 PM, Yuan Lin via llvm-dev >>> <llvm-dev at lists.llvm.org> wrote: >>> > Hi, Sanjay/all, >>> > >>> > I noticed in rL331486 that some compare-select optimizations are >>> disabled >>> > in favor of providing canonicalized cmp+select to the backend. >>> > >>> > I am currently working on a private backend target, and the target >>> has a >>> > small code size limit. With this change, some of the apps went over >>> the >>> > codesize limit. As an example, >>> > >>> > C code: >>> > b = (a > -1) ? 4 : 5; >>> > >>> > ll code: >>> > Before rL331486: >>> > %0 = lshr i16 %a.0.a.0., 15 >>> > %1 = or i16 %0, 4 >>> > >>> > After rL331486: >>> > %cmp = icmp sgt i16 %a.0.a.0., -1 >>> > %cond = select i1 %cmp, i16 4, i16 5 >>> > >>> > With the various encoding restrictions of my particular target, the >>> > cmp/select generated slightly larger code size. However, because the >>> apps >>> > were very close to the code size limit, this slight change pushed them >>> over >>> > the limit. >>> > >>> > If I still prefer to have this optimization performed, then is my >>> best >>> > strategy moving forward to implement this optimization as a peephole >>> opt in >>> > my backend? >>> I personally think it should probably be a DAGCombine transform, >>> driven by a Target Transform Info hook (e.g. like hasAndNot()). >>> >>> > Thanks, >>> > --Yuan >>> Roman. >>> >>> > _______________________________________________ >>> > LLVM Developers mailing list >>> > llvm-dev at lists.llvm.org >>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> > >>> >> >>-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180703/1f52c4ac/attachment.html>
Sanjay Patel via llvm-dev
2018-Jul-03 22:44 UTC
[llvm-dev] Question about canonicalizing cmp+select
I linked the wrong patch review. Here's the patch that was actually committed: https://reviews.llvm.org/D48508 https://reviews.llvm.org/rL335433 On Tue, Jul 3, 2018 at 4:39 PM, Sanjay Patel <spatel at rotateright.com> wrote:> [adding back llvm-dev and cc'ing Craig] > > I think you are asking if we are missing a fold (or your target is missing > enabling another hook) to transform the sext+add into shift+or? I think the > answer is 'yes'. We probably should add that fold. This seems like a > similar case as the recent: https://reviews.llvm.org/D48466 > > Note that on x86, the sext+add becomes zext+sub: > t20: i8 = setcc t3, Constant:i16<-1>, setgt:ch > t24: i16 = zero_extend t20 > t17: i16 = sub Constant:i16<5>, t24 > > Would that transform help your target? > > On Tue, Jul 3, 2018 at 3:55 PM, Yuan Lin <yualin at google.com> wrote: > >> Hi, Roman and Sanjay, >> >> Thank you for your reply! We currently do run DAGCombiner, but didn't >> implement this specific transformation. I just tried turning on >> convertSelectOfCosntantsToMath() in our ISelLowering, but that doesn't >> quite work because it generated a sign_extend op from i1 to i16, which our >> backend currently doesn't support. >> >> Does the DAGCombiner already has this transformation implemented? >> >> Thanks, >> --Yuan >> >> On Tue, Jul 3, 2018 at 11:22 AM Sanjay Patel <spatel at rotateright.com> >> wrote: >> >>> Do you run DAGCombiner? And are you overriding >>> TLI.convertSelectOfConstantsToMath(VT) for your target? >>> >>> For the stated example (true val and false val constants in the select >>> differ by 1), that should already be converted. >>> >>> >>> On Tue, Jul 3, 2018 at 12:13 PM, Roman Lebedev <lebedev.ri at gmail.com> >>> wrote: >>> >>>> On Tue, Jul 3, 2018 at 9:06 PM, Yuan Lin via llvm-dev >>>> <llvm-dev at lists.llvm.org> wrote: >>>> > Hi, Sanjay/all, >>>> > >>>> > I noticed in rL331486 that some compare-select optimizations are >>>> disabled >>>> > in favor of providing canonicalized cmp+select to the backend. >>>> > >>>> > I am currently working on a private backend target, and the target >>>> has a >>>> > small code size limit. With this change, some of the apps went over >>>> the >>>> > codesize limit. As an example, >>>> > >>>> > C code: >>>> > b = (a > -1) ? 4 : 5; >>>> > >>>> > ll code: >>>> > Before rL331486: >>>> > %0 = lshr i16 %a.0.a.0., 15 >>>> > %1 = or i16 %0, 4 >>>> > >>>> > After rL331486: >>>> > %cmp = icmp sgt i16 %a.0.a.0., -1 >>>> > %cond = select i1 %cmp, i16 4, i16 5 >>>> > >>>> > With the various encoding restrictions of my particular target, the >>>> > cmp/select generated slightly larger code size. However, because the >>>> apps >>>> > were very close to the code size limit, this slight change pushed >>>> them over >>>> > the limit. >>>> > >>>> > If I still prefer to have this optimization performed, then is my >>>> best >>>> > strategy moving forward to implement this optimization as a peephole >>>> opt in >>>> > my backend? >>>> I personally think it should probably be a DAGCombine transform, >>>> driven by a Target Transform Info hook (e.g. like hasAndNot()). >>>> >>>> > Thanks, >>>> > --Yuan >>>> Roman. >>>> >>>> > _______________________________________________ >>>> > LLVM Developers mailing list >>>> > llvm-dev at lists.llvm.org >>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>>> > >>>> >>> >>> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180703/3f9bbe36/attachment.html>