Quentin Colombet
2013-Jul-01 18:30 UTC
[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
Hi, ** Problematic ** I am looking for advices to share some logic between DAG combine and target lowering. Basically, I need to know if a bitcast that is about to be inserted during target specific isel lowering will be eliminated during DAG combine. Let me know if there is another, better supported, approach for this kind of problems. ** Motivating Example ** The motivating example comes form the lowering of vector code on armv7. More specifically, the build_vector node is lowered to a target specific ARMISD::build_vector where all the parameters are bitcasted to floating point types. This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between integer unit and floating point unit that may result in inefficient code. Attached motivating_example.ll shows such a case: llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o - ldr r0, [r1] ldr r1, [r2] vmov s1, r1 vmov s0, r0 Here each ldr, vmov sequences could have been replaced by a simple vld1.32. ** Proposed Solution ** Lower to more vector friendly code (using a sequence of insert_vector_elt), when bit casts will not be free. The attached patch demonstrates that, but is missing the proper check to know what DAG combine will do (see TODO). Thanks for your help. Cheers, -Quentin -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/c1dd6e5a/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: ARMISelLowering.patch Type: application/octet-stream Size: 1288 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/c1dd6e5a/attachment.obj> -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/c1dd6e5a/attachment-0001.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: motivating_example.ll Type: application/octet-stream Size: 1114 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/c1dd6e5a/attachment-0001.obj> -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/c1dd6e5a/attachment-0002.html>
Eli Friedman
2013-Jul-01 18:52 UTC
[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com>wrote:> Hi, > > ** Problematic ** > I am looking for advices to share some logic between DAG combine and > target lowering. > > Basically, I need to know if a bitcast that is about to be inserted during > target specific isel lowering will be eliminated during DAG combine. > > Let me know if there is another, better supported, approach for this kind > of problems. > > ** Motivating Example ** > The motivating example comes form the lowering of vector code on armv7. > More specifically, the build_vector node is lowered to a target specific > ARMISD::build_vector where all the parameters are bitcasted to floating > point types. > > This works well, unless the inserted bitcasts survive until instruction > selection. In that case, they incur moves between integer unit and floating > point unit that may result in inefficient code. > > Attached motivating_example.ll shows such a case: > llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o - > ldr r0, [r1] > ldr r1, [r2] > vmov s1, r1 > vmov s0, r0 > Here each ldr, vmov sequences could have been replaced by a simple vld1.32. > > ** Proposed Solution ** > Lower to more vector friendly code (using a sequence of > insert_vector_elt), when bit casts will not be free. > The attached patch demonstrates that, but is missing the proper check to > know what DAG combine will do (see TODO). >I think you're approaching this backwards: the obvious thing to do is to generate the insert_vector_elt sequence unconditionally, and DAGCombine that sequence to a build_vector when appropriate. -Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/e2f1191b/attachment.html>
Quentin Colombet
2013-Jul-01 19:07 UTC
[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com> wrote:> On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi, > > ** Problematic ** > I am looking for advices to share some logic between DAG combine and target lowering. > > Basically, I need to know if a bitcast that is about to be inserted during target specific isel lowering will be eliminated during DAG combine. > > Let me know if there is another, better supported, approach for this kind of problems. > > ** Motivating Example ** > The motivating example comes form the lowering of vector code on armv7. > More specifically, the build_vector node is lowered to a target specific ARMISD::build_vector where all the parameters are bitcasted to floating point types. > > This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between integer unit and floating point unit that may result in inefficient code. > > Attached motivating_example.ll shows such a case: > llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o - > ldr r0, [r1] > ldr r1, [r2] > vmov s1, r1 > vmov s0, r0 > Here each ldr, vmov sequences could have been replaced by a simple vld1.32. > > ** Proposed Solution ** > Lower to more vector friendly code (using a sequence of insert_vector_elt), when bit casts will not be free. > The attached patch demonstrates that, but is missing the proper check to know what DAG combine will do (see TODO). > > I think you're approaching this backwards: the obvious thing to do is to generate the insert_vector_elt sequence unconditionally, and DAGCombine that sequence to a build_vector when appropriate.Thanks Eli. I will try that approach. -Quentin> > -Eli-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/33178708/attachment.html>
Quentin Colombet
2013-Jul-01 20:33 UTC
[LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
On Jul 1, 2013, at 11:52 AM, Eli Friedman <eli.friedman at gmail.com> wrote:> On Mon, Jul 1, 2013 at 11:30 AM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi, > > ** Problematic ** > I am looking for advices to share some logic between DAG combine and target lowering. > > Basically, I need to know if a bitcast that is about to be inserted during target specific isel lowering will be eliminated during DAG combine. > > Let me know if there is another, better supported, approach for this kind of problems. > > ** Motivating Example ** > The motivating example comes form the lowering of vector code on armv7. > More specifically, the build_vector node is lowered to a target specific ARMISD::build_vector where all the parameters are bitcasted to floating point types. > > This works well, unless the inserted bitcasts survive until instruction selection. In that case, they incur moves between integer unit and floating point unit that may result in inefficient code. > > Attached motivating_example.ll shows such a case: > llc -O3 -mtriple thumbv7-apple-ios3 motivating_example.ll -o - > ldr r0, [r1] > ldr r1, [r2] > vmov s1, r1 > vmov s0, r0 > Here each ldr, vmov sequences could have been replaced by a simple vld1.32. > > ** Proposed Solution ** > Lower to more vector friendly code (using a sequence of insert_vector_elt), when bit casts will not be free. > The attached patch demonstrates that, but is missing the proper check to know what DAG combine will do (see TODO). > > I think you're approaching this backwards: the obvious thing to do is to generate the insert_vector_elt sequence unconditionally, and DAGCombine that sequence to a build_vector when appropriate.Hi Eli, I have started to look into the direction you gave me. I may have miss something but I do not see how the proposed direction solves the issue. Indeed to be able to DAGCombine a insert_vector_elt sequences into a ARMISD::build_vector, I still need to know if it would be profitable, i.e., if DAGCombine will remove the bitcasts that combining/lowering is about to insert. Since target specific DAGCombine are also done in TargetLowering I do not have access to more DAGCombine logic (at least DAGCombineInfo is not providing the require information). What did I miss? Thanks, -Quentin> > -Eli-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20130701/5a04d7c5/attachment.html>
Reasonably Related Threads
- [LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
- [LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
- [LLVMdev] Advices Required: Best practice to share logic between DAG combine and target lowering?
- [LLVMdev] Simple NEON optimization
- [LLVMdev] Modeling GPU vector registers, again (with my implementation)