dag at cray.com
2012-May-01 21:21 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Justin Holewinski <justin.holewinski at gmail.com> writes:

> For something like PTX, runtime calls take care of the call semantics so
> it is either up to the user or the frontend to set up the runtime calls
> correctly. We don't need to completely solve this problem. Yet. :)
>
> But there has to be some interface that allows an LLVM IR function
> from one architecture to get at the code or name of a function from
> another architecture. This could be handled in the front-end, but it
> seems like we could design some abstraction.

Doesn't LLVM support taking the address of a function in another address space? If not it probably should.

> > If you have a global variable, what target "sees" it? Does it need to
> > be annotated along with the function?
>
> Sorry, I meant global variables in the LLVM IR. Are they valid for
> only one architecture in the IR module?

Ah. It very much depends on the system architecture. Since current PTX targets run in an entirely separate address space globals would have to be replicated and copied to/from the device. This might require target-specific modules. For a system with shared memory, I would assume the globals could simply be shared "as usual." Otherwise, it wouldn't be shared memory. In a target-specific module design, one or the other would be an extern reference.

> > If you're targeting Cell, in contrast, you'd want to compile both down
> > to object files.
>
> I think we probably want to do that for PTX as well.
>
> Maybe, maybe not. It may make sense to rely on run-time JIT'ing of the PTX.

That happens regardless. There is no way to produce instructions "to the metal" for NVIDIA targets. I was referring to PTX object files above.

> Do we allow more than one Module per file? If not, that seems like an
> arbitrary limitation. If we allowed that we could have each module
> specify a different target.
>
> That could work.

Given your questions about globals above, I think it might be a requirement unless we want to require code for separate targets live in separate files. I think that's too restrictive because some opt pass might want to extract kernels and put them on separate targets.

-Dave
Tobias Grosser
2012-May-07 08:15 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
On 05/01/2012 11:21 PM, dag at cray.com wrote:
> Justin Holewinski <justin.holewinski at gmail.com> writes:
>
>> For something like PTX, runtime calls take care of the call semantics so
>> it is either up to the user or the frontend to set up the runtime calls
>> correctly. We don't need to completely solve this problem. Yet. :)
>>
>> But there has to be some interface that allows an LLVM IR function
>> from one architecture to get at the code or name of a function from
>> another architecture. This could be handled in the front-end, but it
>> seems like we could design some abstraction.
>
> Doesn't LLVM support taking the address of a function in another address
> space? If not it probably should.

Hi Dave,

I highly appreciate your idea of integrating heterogeneous computing features directly into LLVM-IR. I believe this can be a way worth going, but I doubt now is the right moment for it. I don't share your opinion that it is easy to move LLVM-IR in this direction, but I rather believe that this is an engineering project that will take several months of full time work. Possibly not the implementation itself, but designing it, discussing it, implementing it and ensuring that the new feature does not increase run-time and memory footprint or reduce maintainability of LLVM. Due to the large amount of changes that would be needed all over LLVM, I really think we should first get some experience in this area, before we burn this feature into LLVM-IR.

The llvm.codegen intrinsic seems the perfect match to build up such experience. It requires no changes to LLVM-IR itself and only very local changes to the generic back end infrastructure. It may possibly not be as generic as other solutions, but it is far from being an ugly hack. Quite in contrast, it is a close match for OpenCL like run times and works well with the existing PTX back end.

Do you have definite plans to add heterogeneous computing capabilities to LLVM-IR within the next couple (3-4) months? Will these capabilities supersede the llvm.codegen intrinsic?

In case such plans do not exist, what do you think about adding the llvm.codegen() intrinsic for now? If mid-term plans exist for heterogeneous extensions to LLVM-IR, we can document them along the intrinsic.

Cheers
Tobi
Tobias Grosser
2012-May-07 08:31 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
On 05/01/2012 11:21 PM, dag at cray.com wrote:
> Justin Holewinski <justin.holewinski at gmail.com> writes:
>> Do we allow more than one Module per file? If not, that seems like an
>> arbitrary limitation. If we allowed that we could have each module
>> specify a different target.
>>
>> That could work.
>
> Given your questions about globals above, I think it might be a
> requirement unless we want to require code for separate targets live in
> separate files. I think that's too restrictive because some opt pass
> might want to extract kernels and put them on separate targets.

I think we need several modules per file. Supporting AMDIL and PTX at the same time sounds more than useful.

Another question that pops up to me. If we support several modules, what would the command line options to opt look like? Do we want to make all options sub-module specific? Getting this user friendly may be difficult. The same for the output of llc. At the moment llc can dump the assembly to stdout. Would you dump the assembly of the different modules to stdout or do you want to support multiple -o options to specify the various output files?

The same for the LLVM CodeGen/Target API. It may have to be changed to support the output of several modules or the specification of different options for each module. We also have the same problems as Justin pointed out for the codegen intrinsic. Some llc options are globals; they would need to be made Codegen options if we want to set them on a per-module basis.

Cheers
Tobi
dag at cray.com
2012-May-07 16:07 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Tobias Grosser <tobias at grosser.es> writes:

>> Doesn't LLVM support taking the address of a function in another address
>> space? If not it probably should.
>
> Hi Dave,
>
> I highly appreciate your idea of integrating heterogeneous computing
> features directly into LLVM-IR. I believe this can be a way worth
> going, but I doubt now is the right moment for it. I don't share your
> opinion that it is easy to move LLVM-IR in this direction, but I
> rather believe that this is an engineering project that will take
> several months of full time work. Possibly not the implementation
> itself, but designing it, discussing it, implementing it and ensuring
> that the new feature does not increase run-time and memory footprint
> or reduce maintainability of LLVM. Due to the large amount of changes
> that would be needed all over LLVM, I really think we should first get
> some experience in this area, before we burn this feature into
> LLVM-IR.

I'm not advocating that we rush into this by any means. I'm well aware that the discussions and experiments will take quite a while to plow through. I think a small set of enhancements will go a long way. I'd like to avoid hacking in special intrinsics like llvm.codegen that feel so very much in opposition to the rest of the LLVM design.

> The llvm.codegen intrinsic seems the perfect match to build up such
> experience. It requires no changes to LLVM-IR itself and only very
> local changes to the generic back end infrastructure. It may possibly
> not be as generic as other solutions, but it is far from being an ugly
> hack. Quite in contrast, it is a close match for OpenCL like run times
> and works well with the existing PTX back end.

I'll bite my tongue on the designs of OpenCL and CUDA. :) But regardless, if those are your targets you don't need llvm.codegen at all.

> Do you have definite plans to add heterogeneous computing
> capabilities to LLVM-IR within the next couple (3-4) months? Will
> these capabilities supersede the llvm.codegen intrinsic?

No specific plans to change the IR. We have not found a need for such changes on current architectures, as the runtimes provided with those architectures handle the ugly details. I am thinking further into the future and what might be needed there.

> In case such plans do not exist, what do you think about adding the
> llvm.codegen() intrinsic for now? If mid-term plans exist for
> heterogeneous extensions to LLVM-IR, we can document them along the
> intrinsic.

I think it's completely unnecessary if your goal is to get something working on current hardware. We do have certain structural/software engineering changes to the implementation of LLVM's code generator that would be useful. This primarily is the ability to completely process one function before moving on to the next. This is important when dealing with heterogeneous systems, as one has to, for example, write out different asm for the various targets at a function granularity. But that doesn't require any IR changes whatsoever.

-Dave
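A minimal sketch of the "completely process one function before moving on to the next" idea, in plain Python rather than real LLVM code. The `Function` record, its per-function `target` tag, and `emit_function` are hypothetical stand-ins invented for illustration; nothing here is LLVM's actual code generator API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str
    target: str  # hypothetical per-function target tag, e.g. "x86_64" or "ptx"


def emit_function(fn):
    """Stand-in for code generation: returns placeholder assembly text."""
    return f"; target: {fn.target}\n{fn.name}:\n    ret\n"


def codegen_per_function(functions):
    """Finish each function completely before starting the next one,
    appending its assembly to the output stream of its own target."""
    streams = defaultdict(list)
    for fn in functions:
        streams[fn.target].append(emit_function(fn))
    return {target: "".join(chunks) for target, chunks in streams.items()}


if __name__ == "__main__":
    funcs = [
        Function("main", "x86_64"),
        Function("kernel", "ptx"),
        Function("helper", "x86_64"),
    ]
    for target, asm in codegen_per_function(funcs).items():
        print(f"--- {target} ---")
        print(asm)
```

Because each function is emitted as a unit, functions for different targets can be interleaved in the input while each target's output keeps the original function order.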
dag at cray.com
2012-May-07 16:11 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Tobias Grosser <tobias at grosser.es> writes:

> I think we need several modules per file. Supporting AMDIL and PTX at
> the same time sounds more than useful.

Yes.

> Another question that pops up to me. If we support several modules,
> what would the command line options to opt look like? Do we want to
> make all options sub-module specific? Getting this user friendly may
> be difficult. The same for the output of llc. At the moment llc can
> dump the assembly to stdout. Would you dump the assembly of the
> different modules to stdout or do you want to support multiple -o
> options to specify the various output files?

I think you're making this too complicated. I think opt should continue to work the way it does now. Apply the same flags to all modules. If the user wants different transformations based on target, either the target characteristics should inform the optimizer or the file should be split into multiple IR files.

> The same for the LLVM CodeGen/Target API. It may have to be changed
> to support the output of several modules or the specification of
> different options for each module. We also have the same problems as
> Justin pointed out for the codegen intrinsic. Some llc options are
> globals; they would need to be made Codegen options if we want to set
> them on a per-module basis.

Can you give me some examples? What kinds of options would be target-specific and not implied by the target attribute on the Module?

-Dave
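To make the "same flags for all modules" proposal concrete, here is a small Python sketch under stated assumptions. The `Module` class, the two passes, and the `UNROLL_FACTOR` table are all made up for illustration and are not part of opt or any LLVM API. It shows one pass list applied to every module, with the target recorded on the module informing what a pass does.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    target: str                               # stands in for the module's target attribute
    log: list = field(default_factory=list)   # records what each pass did


# Made-up numbers: how a target's characteristics might inform one pass.
UNROLL_FACTOR = {"x86_64": 4, "ptx": 1}


def inline_pass(module):
    """Hypothetical target-independent pass."""
    module.log.append("inline")


def unroll_pass(module):
    """Hypothetical pass that consults the module's target."""
    factor = UNROLL_FACTOR.get(module.target, 2)
    module.log.append(f"unroll x{factor}")


def run_opt(modules, passes):
    """Apply the same pass list, in the same order, to every module."""
    for module in modules:
        for run_pass in passes:
            run_pass(module)
    return modules


if __name__ == "__main__":
    host, device = Module("x86_64"), Module("ptx")
    run_opt([host, device], [inline_pass, unroll_pass])
    print(host.target, host.log)
    print(device.target, device.log)
```

The user-facing pass list stays global in this sketch; only the behavior inside a pass varies with the target, which is one reading of "the target characteristics should inform the optimizer".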
dag at cray.com
2012-May-07 16:13 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Tobias Grosser <tobias at grosser.es> writes:

> Would you dump the assembly of the different modules to stdout or do
> you want to support multiple -o options to specify the various output
> files?

I forgot to address this one. With current OpenCL and CUDA specifications, there's no need to do multiple .o files. In my mind, llc should output one .o (one .s, etc.). Anything else wreaks havoc on build systems. But Chris has the final say, I think.

-Dave