dag at cray.com
2012-May-07 16:07 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Tobias Grosser <tobias at grosser.es> writes:

>> Doesn't LLVM support taking the address of a function in another address
>> space? If not it probably should.
>
> Hi Dave,
>
> I highly appreciate your idea of integrating heterogeneous computing
> features directly into LLVM-IR. I believe this can be a way worth
> going, but I doubt now is the right moment for it. I don't share your
> opinion that it is easy to move LLVM-IR in this direction, but I
> rather believe that this is an engineering project that will take
> several months of full-time work. Possibly not the implementation
> itself, but designing it, discussing it, implementing it and ensuring
> that the new feature does not increase run-time and memory footprint
> or reduce maintainability of LLVM. Due to the large amount of changes
> that would be needed all over LLVM, I really think we should first get
> some experience in this area, before we burn this feature into
> LLVM-IR.

I'm not advocating that we rush into this by any means. I'm well aware
that the discussions and experiments will take quite a while to plow
through. I think a small set of enhancements will go a long way. I'd
like to avoid hacking in special intrinsics like llvm.codegen that feel
so very much in opposition to the rest of the LLVM design.

> The llvm.codegen intrinsic seems the perfect match to build up such
> experience. It requires no changes to LLVM-IR itself and only very
> local changes to the generic back end infrastructure. It may possibly
> not be as generic as other solutions, but it is far from being an ugly
> hack. Quite in contrast, it is a close match for OpenCL-like run times
> and works well with the existing PTX back end.

I'll bite my tongue on the designs of OpenCL and CUDA. :)

But regardless, if those are your targets you don't need llvm.codegen at
all.

> Do you have definitive plans to add heterogeneous computing
> capabilities to LLVM-IR within the next couple (3-4) months? Will
> these capabilities supersede the llvm.codegen intrinsic?

No specific plans to change the IR. We have not found a need for such
changes on current architectures, as the runtimes provided with those
architectures handle the ugly details. I am thinking further into the
future and what might be needed there.

> In case such plans do not exist, what do you think about adding the
> llvm.codegen() intrinsic for now? If mid-term plans exist for
> heterogeneous extensions to LLVM-IR, we can document them along the
> intrinsic.

I think it's completely unnecessary if your goal is to get something
working on current hardware.

We do have certain structural/software engineering changes to the
implementation of LLVM's code generator that would be useful. This
primarily is the ability to completely process one function before
moving on to the next. This is important when dealing with heterogeneous
systems, as one has to, for example, write out different asm for the
various targets at a function granularity. But that doesn't require any
IR changes whatsoever.

-Dave
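
The function-at-a-time model described above (fully process one function, write its asm to the output for its target, then move on) can be sketched roughly as follows. This is only an illustration under stated assumptions: `PendingFunction`, `AsmOutputs` and `codegenFunctionAtATime` are hypothetical names invented for the sketch, not LLVM APIs, and the "asm" written is a placeholder comment line rather than real code generation.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a function that is waiting for code generation.
struct PendingFunction {
    std::string name;
    std::string target;  // e.g. "x86" or "ptx"
};

// Hypothetical per-target outputs: one in-memory asm "file" per target.
using AsmOutputs = std::map<std::string, std::ostringstream>;

// Process each function completely, write its output to the stream that
// belongs to its target, and release it before touching the next one.
inline void codegenFunctionAtATime(
    std::vector<std::unique_ptr<PendingFunction>> &functions,
    AsmOutputs &outputs) {
    for (auto &fn : functions) {
        assert(fn != nullptr);
        std::ostringstream &out = outputs[fn->target];
        // Placeholder for real code generation of this one function.
        out << "// " << fn->target << " asm for " << fn->name << "\n";
        // The function is no longer needed once its asm is written, so it
        // does not have to stay in memory while later functions are handled.
        fn.reset();
    }
    functions.clear();
}
```

Feeding this a list that mixes "x86" and "ptx" functions leaves one output stream per target, each holding only the functions for that target in their original order. Whether a real code generator can free a function this early depends on what later passes still need, which the sketch does not model.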
Tobias Grosser
2012-May-07 21:06 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
On 05/07/2012 06:07 PM, dag at cray.com wrote:
> Tobias Grosser <tobias at grosser.es> writes:
>
>>> Doesn't LLVM support taking the address of a function in another address
>>> space? If not it probably should.
>>
>> Hi Dave,
>> The llvm.codegen intrinsic seems the perfect match to build up such
>> experience. It requires no changes to LLVM-IR itself and only very
>> local changes to the generic back end infrastructure. It may possibly
>> not be as generic as other solutions, but it is far from being an ugly
>> hack. Quite in contrast, it is a close match for OpenCL-like run times
>> and works well with the existing PTX back end.
>
> I'll bite my tongue on the designs of OpenCL and CUDA. :)
>
> But regardless, if those are your targets you don't need llvm.codegen at
> all.

Why is it not needed? I don't see anything that could currently replace
it. How can I create a loadable optimizer module that creates embedded
PTX code without the llvm.codegen intrinsic?

>> Do you have definitive plans to add heterogeneous computing
>> capabilities to LLVM-IR within the next couple (3-4) months? Will
>> these capabilities supersede the llvm.codegen intrinsic?
>
> No specific plans to change the IR. We have not found a need for such
> changes on current architectures, as the runtimes provided with those
> architectures handle the ugly details. I am thinking further into the
> future and what might be needed there.

OK. I am talking about something that is available within the next few
weeks in LLVM.

>> In case such plans do not exist, what do you think about adding the
>> llvm.codegen() intrinsic for now? If mid-term plans exist for
>> heterogeneous extensions to LLVM-IR, we can document them along the
>> intrinsic.
>
> I think it's completely unnecessary if your goal is to get something
> working on current hardware.

Again, why is it unnecessary?

> We do have certain structural/software engineering changes to the
> implementation of LLVM's code generator that would be useful. This
> primarily is the ability to completely process one function before
> moving on to the next. This is important when dealing with heterogeneous
> systems, as one has to, for example, write out different asm for the
> various targets at a function granularity. But that doesn't require any
> IR changes whatsoever.

At least for CUDA/OpenCL the modules are entirely independent. Is such
a fine granularity really required?

Tobi
dag at cray.com
2012-May-07 22:12 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Tobias Grosser <tobias at grosser.es> writes:

>> But regardless, if those are your targets you don't need llvm.codegen at
>> all.
>
> Why is it not needed? I don't see anything that could currently
> replace it. How can I create a loadable optimizer module that creates
> embedded PTX code without the llvm.codegen intrinsic?

Embed the PTX as a string in the x86 object/executable. This requires
that the AsmPrinter can be directed to multiple files, but it doesn't
require any IR changes at all. Actually, since your modules are
independent, it doesn't even require AsmPrinter changes.

>> No specific plans to change the IR. We have not found a need for such
>> changes on current architectures, as the runtimes provided with those
>> architectures handle the ugly details. I am thinking further into the
>> future and what might be needed there.
>
> OK. I am talking about something that is available within the next few
> weeks in LLVM.

Then you don't need a special intrinsic.

>> I think it's completely unnecessary if your goal is to get something
>> working on current hardware.
>
> Again, why is it unnecessary?

See above.

>> We do have certain structural/software engineering changes to the
>> implementation of LLVM's code generator that would be useful. This
>> primarily is the ability to completely process one function before
>> moving on to the next. This is important when dealing with heterogeneous
>> systems, as one has to, for example, write out different asm for the
>> various targets at a function granularity. But that doesn't require any
>> IR changes whatsoever.
>
> At least for CUDA/OpenCL the modules are entirely independent. Is such
> a fine granularity really required?

If they're independent, no. In our case the (to us) frontend extracts
kernels and sends them to codegen the same way it sends x86 code.

Originally I did this for scalability purposes. We could not compile
very large codes when LLVM insisted we keep all the IR around all the
time. We had to get rid of that restriction, which led to the
function-at-a-time model. It just happens that it works well for current
accelerators.

-Dave
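
The suggestion above, embedding the PTX as a string in the host object/executable, might look roughly like this on the host side. It is a minimal sketch, not code from the thread: the PTX text is a hand-written illustrative kernel stub that has not been checked against any particular toolchain, and `kEmbeddedPtx`, `embeddedPtxImage`, `embeddedPtxSize` and `embeddedPtxDeclaresEntry` are hypothetical names. No GPU runtime is called, so the sketch has no dependencies beyond the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Illustrative PTX text (an assumption for this sketch). In a real build
// this string would be the output of the PTX back end for the device module.
static const char kEmbeddedPtx[] =
    ".version 3.0\n"
    ".target sm_20\n"
    ".address_size 64\n"
    ".visible .entry kernel()\n"
    "{\n"
    "    ret;\n"
    "}\n";

// Pointer to the embedded, NUL-terminated PTX image. A runtime loader
// (for example the CUDA driver API's cuModuleLoadData) accepts a pointer
// to such an image; that call is deliberately not made here.
inline const char *embeddedPtxImage() { return kEmbeddedPtx; }

// Size of the image in bytes, excluding the terminating NUL.
inline std::size_t embeddedPtxSize() { return sizeof(kEmbeddedPtx) - 1; }

// Returns true if the embedded image contains an entry-point declaration
// whose name starts with the given text (a plain substring check).
inline bool embeddedPtxDeclaresEntry(const char *name) {
    assert(name != nullptr);
    const std::string needle = std::string(".entry ") + name;
    return std::strstr(kEmbeddedPtx, needle.c_str()) != nullptr;
}
```

Because the device code is ordinary constant data in the host binary, nothing in the host module's IR has to know it is PTX, which is the point being made: the host and device modules stay independent and the runtime does the loading.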