vivek pandya via llvm-dev
2016-Feb-29 21:00 UTC
[llvm-dev] [GSoC 2016] Code Generation Improvements task
Hello LLVM Community,

I am interested in doing the following project with LLVM for GSoC 2016.

Code Generation Improvements: in particular, generalize target-specific backend passes that could be target-independent.

I have done some initial study to understand the task. Please help me develop the proposal. My initial findings are as follows:

1. lib/Target/Hexagon/RDF*
The code for these passes is mostly target-independent, so to generalize them the code needs to be wrapped in a MachineFunction pass and then used as required. Also, if it is not already done, the merge sets of the SSA-based CFG can be computed at SSA construction time. This can improve the performance of Ramakrishna's algorithm.

2. lib/Target/AArch64/AArch64AddressTypePromotion.cpp
As far as I understand, this pass promotes sign extension of 32-bit integers (addresses) and performs the calculation on 64-bit numbers, so that processes need not switch execution mode to 32-bit. Other platforms such as MIPS, NVPTX, and Sparc could benefit from such an optimization, because MIPS64 supports MIPS32 instructions and requires a mode switch indicated by a control register. For PTX, the size of pointers depends on the host machine, so there might be a similar situation. Architectures like PowerPC do not need such an optimization, since a 64-bit instruction operating in 32-bit mode passes only the lower 32 bits.

3. lib/Target/AArch64/AArch64PromoteConstant.cpp
This pass tries to simplify aggregate data, such as a struct of constants, using the special SIMD instructions available on the system. On ARM this is NEON; other architectures have SIMD support as well, specifically MIPS, IBM System z, PowerPC with VMX/AltiVec, and x86 with Intel’s AVX.

Apart from these, the proposal can include a task for merging the delay slot filling logic (from Sparc and Mips) into a single target-independent pass.

This is just a preliminary investigation. I am not an expert in all the architectures supported by LLVM, only in MIPS, x86, and to some extent ARM.

I have a question regarding target hooks. Does this mean using the TargetInfo and SubTargetInfo classes to determine the architecture type at run time and performing the optimization based on that (i.e. using target-specific instructions)?

Please help me! Am I going in the right direction? Please suggest some code or documents to look at for further ideas. I would also like to know whether anyone would be willing to mentor me for this project.

Sincerely,
Vivek Pandya
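Editorial illustration: the idea of a single target-independent delay slot filler that consults target hooks can be modelled with a toy sketch. This is not LLVM code and does not reflect the Sparc or Mips implementations. `Instr`, `TargetHooks`, `ToyBranchHooks`, and `fill_delay_slots` are hypothetical names, and the only dependence check is on register defs and uses (memory and other side effects are ignored).

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Instr:
    """Toy machine instruction: an opcode plus the registers it writes and reads."""
    opcode: str
    defs: Set[str] = field(default_factory=set)
    uses: Set[str] = field(default_factory=set)


class TargetHooks:
    """Hypothetical hook interface; each target would supply its own answers."""

    def has_delay_slot(self, instr: Instr) -> bool:
        raise NotImplementedError

    def make_nop(self) -> Instr:
        raise NotImplementedError


class ToyBranchHooks(TargetHooks):
    """Assumed toy target where these three opcodes have one delay slot each."""

    def has_delay_slot(self, instr: Instr) -> bool:
        return instr.opcode in {"beq", "bne", "j"}

    def make_nop(self) -> Instr:
        return Instr("nop")


def fill_delay_slots(block: List[Instr], hooks: TargetHooks) -> List[Instr]:
    """Target-independent driver: all target knowledge comes from `hooks`."""
    out: List[Instr] = []
    last_in_slot = False  # True when out[-1] already occupies a delay slot
    for instr in block:
        if not hooks.has_delay_slot(instr):
            out.append(instr)
            last_in_slot = False
            continue
        filler = None
        if out and not last_in_slot:
            prev = out[-1]
            # The previous instruction may move into the slot only if it has
            # no register dependence on the branch in either direction.
            conflict = bool(prev.defs & (instr.uses | instr.defs)) or bool(
                prev.uses & instr.defs
            )
            if not conflict:
                filler = out.pop()
        out.append(instr)
        out.append(filler if filler is not None else hooks.make_nop())
        last_in_slot = True
    return out


if __name__ == "__main__":
    block = [Instr("add", {"r1"}, {"r2", "r3"}), Instr("beq", set(), {"r4", "r5"})]
    print([i.opcode for i in fill_delay_slots(block, ToyBranchHooks())])
```

In this model the driver never asks which architecture it is running on. It only asks the hook object questions such as "does this instruction have a delay slot?", which is one way to read the "target hooks" phrasing in the project description.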
Tim Northover via llvm-dev
2016-Mar-01 04:53 UTC
[llvm-dev] [GSoC 2016] Code Generation Improvements task
Hi Vivek,

(Mostly responding with AArch64 hints, though anything I happen to know from elsewhere too).

On 29 February 2016 at 13:00, vivek pandya via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 2. lib/Target/AArch64/AArch64AddressTypePromotion.cpp
> As far as I understand this pass promotes sign extension for 32 bit integer (
> address) and performs calculation on 64 bit number thus processes need not
> switch execution mode to 32 bit.

Switching execution mode isn't an option on AArch64 (it can only happen with OS support and never happens within a single process on a sane OS).

This pass is more a matter of putting the IR in a form that precisely matches the addressing modes that are actually available. AArch64 can encode addresses like "base64 + sext(offset32)" into the actual load/store instruction so it's advantageous to put the sext as close as possible to the pointer dereference.

I'm afraid I don't really know enough about other architectures to say which could benefit. It's obviously only beneficial if they have the addressing modes to support it.

> 3. lib/Target/AArch64/AArch64PromoteConstant.cpp
> This pass tries to simplify aggregate data like struct of const with special
> SIMD instructions available on the system. For example on ARM its NEON
> similarly other architectures have SIMD support specifically MIPS, IBM
> System Z, Power PC with VMX/AltiVec and x86 with Intel’s AVX.

Possibly. It seems to rely pretty strongly on ARM's "load more than you can actually use" instructions: vldN instructions can load up to 4 128-bit vectors, but they can still only be used as 128-bit vectors. If other targets possess similar, then they could well benefit; if not, then it's probably pointless.

> I have question regarding Target hooks. Does it means using TargetInfo an
> SubTargetInfo class and at runtime decide architecture type and based on
> that perform optimization ( i.e use target specific instructions ) ?

I think they more normally live in TargetTransformInfo.

> Please help me ! Am I going in right direction ? Suggest some code ,
> document to look for further ideas. Also if any one like to mentor me for
> this project.

It sounds like a plausible direction, but documentation is always lacking in these kinds of things.

As a complete outsider to targets with delay slots, merging their logic sounds like a nice improvement to me (especially as Lanai is probably incoming as another ISA that has decided delay slots are a good idea). But (also as an outsider) I have no idea how practical that really is.

Cheers.

Tim.
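Editorial illustration: the "base64 + sext(offset32)" address form mentioned above can be worked through numerically. The sketch below only models the arithmetic. It is not LLVM code, and `sext32` and `effective_address` are hypothetical helper names. In AArch64 assembly this form corresponds to a register-offset access with a sign-extending modifier, for example `ldr x0, [x1, w2, sxtw]`.

```python
MASK64 = (1 << 64) - 1


def sext32(value: int) -> int:
    """Interpret the low 32 bits of `value` as a signed two's-complement number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def effective_address(base64: int, offset32: int) -> int:
    """Model base64 + sext(offset32), wrapped to 64 bits."""
    return (base64 + sext32(offset32)) & MASK64


if __name__ == "__main__":
    # 0xFFFFFFFC is -4 as a signed 32-bit value, so the access lands below the base.
    print(hex(effective_address(0x1000, 0xFFFFFFFC)))
    # A small positive offset is unaffected by the sign extension.
    print(hex(effective_address(0x1000, 4)))
```

When the load or store performs the extension itself, no separate sign-extend instruction is needed for the offset, which is why the shape of the IR around the pointer dereference matters for instruction selection.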
vivek pandya via llvm-dev
2016-Mar-01 17:26 UTC
[llvm-dev] [GSoC 2016] Code Generation Improvements task
On Tue, Mar 1, 2016 at 10:23 AM, Tim Northover <t.p.northover at gmail.com> wrote:
> [...]
> As a complete outsider to targets with delay slots, merging their
> logic sounds like a nice improvement to me (especially as Lanai is
> probably incoming as another ISA that has decided delay slots are a
> good idea). But (also as an outsider) I have no idea how practical
> that really is.

Thanks, Tim, for the additional insight. I will gather more information in the direction you suggest.

Furthermore, the three tasks mentioned here may not be much work for someone with a good grasp of LLVM, but for me they may be sufficient for the GSoC duration. However, Google may not fund a proposal with such a limited number of improvements, so I am thinking of also including some of the TODOs in StackColoring.cpp and StackSlotColoring.cpp in the proposal. Would that be enough to demonstrate in the proposal?

I am still looking for feedback on the RDF part, and for someone willing to mentor me.

Sincerely,
Vivek