dag at cray.com
2012-May-07 22:14 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Tobias Grosser <tobias at grosser.es> writes:

>> I forgot to address this one. With current OpenCL and CUDA
>> specifications, there's no need to do multiple .o files. In my mind,
>> llc should output one .o (one .s, etc.). Anything else wreaks havoc on
>> build systems.
>
> Yes, that's what I am advocating for. There is no need for all this
> complexity. Both standards store the embedded code as a string in the
> host module. That is exactly what the llvm.codegen intrinsic
> models. It requires zero further changes to the code generation
> backend.

But why do you need an intrinsic to do that? Just generate the code to
a file and suck it into a string, maybe with an external "linker" tool.

If you just want something to work, that should be sufficient. If you
want some long-term design/implementation I don't think llvm.codegen is
it.

> In contrast, extending LLVM-IR to support heterogeneous modules
> requires us to add logic to the llvm code generation that knows how to
> link the different sub-modules.

We already have the Linker.

-Dave
Tobias Grosser
2012-May-08 09:20 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
On 05/08/2012 12:14 AM, dag at cray.com wrote:
> Tobias Grosser <tobias at grosser.es> writes:
>
>>> I forgot to address this one. With current OpenCL and CUDA
>>> specifications, there's no need to do multiple .o files. In my mind,
>>> llc should output one .o (one .s, etc.). Anything else wreaks havoc on
>>> build systems.
>>
>> Yes, that's what I am advocating for. There is no need for all this
>> complexity. Both standards store the embedded code as a string in the
>> host module. That is exactly what the llvm.codegen intrinsic
>> models. It requires zero further changes to the code generation
>> backend.
>
> But why do you need an intrinsic to do that? Just generate the code to
> a file and suck it into a string, maybe with an external "linker" tool.
>
> If you just want something to work, that should be sufficient. If you
> want some long-term design/implementation I don't think llvm.codegen is
> it.

OK. I think we are on the same track. Yes, there is no need for a lot of
infrastructure. Storing PTX in a string in the host module is the only
thing needed.

So why the intrinsic? I want to create the PTX string from an LLVM-IR
optimizer pass that can be loaded into clang, dragonegg, opt, ...
An LLVM-IR optimizer pass does not have access to the file system, and it
cannot link to the LLVM back ends to create PTX directly. Creating PTX in
an optimizer pass would be an ugly hack. The cleaner solution is to store
an LLVM-IR string in the host module and to mark it with the llvm.codegen()
intrinsic. When the module is processed by the backend, the string is
automatically translated to PTX. This requires no additional file writing,
introduces no layering violations, and seems to be very simple.

I don't see a better way to translate LLVM-IR to PTX. Do you still believe
introducing file writing to an optimizer module is a good and portable
solution?

Cheers
Tobi
Justin Holewinski
2012-May-08 15:08 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
On Tue, May 8, 2012 at 2:20 AM, Tobias Grosser <tobias at grosser.es> wrote:
> On 05/08/2012 12:14 AM, dag at cray.com wrote:
>
>> Tobias Grosser <tobias at grosser.es> writes:
>>
>>>> I forgot to address this one. With current OpenCL and CUDA
>>>> specifications, there's no need to do multiple .o files. In my mind,
>>>> llc should output one .o (one .s, etc.). Anything else wreaks havoc on
>>>> build systems.
>>>
>>> Yes, that's what I am advocating for. There is no need for all this
>>> complexity. Both standards store the embedded code as a string in the
>>> host module. That is exactly what the llvm.codegen intrinsic
>>> models. It requires zero further changes to the code generation
>>> backend.
>>
>> But why do you need an intrinsic to do that? Just generate the code to
>> a file and suck it into a string, maybe with an external "linker" tool.
>>
>> If you just want something to work, that should be sufficient. If you
>> want some long-term design/implementation I don't think llvm.codegen is
>> it.
>
> OK. I think we are on the same track. Yes, there is no need for a lot of
> infrastructure. Storing PTX in a string of the host module, is the only
> thing needed.
>
> So why the intrinsic? I want to create the PTX string from an LLVM-IR
> optimizer pass, that should be loaded into clang, dragonegg, opt, ..
> An LLVM-IR optimizer pass does not have access to the file system and it
> can not link to the LLVM back ends to directly create PTX. Creating PTX in
> an optimizer pass would be an ugly hack. The cleaner solution is to store
> an LLVM-IR string in the host module and to mark it with the llvm.codegen()
> intrinsic. When the module is processed by the backend, the string is
> automatically translated to PTX. This requires no additional file writing,
> introduces no layering violations and seems to be very simple.
>
> I don't see a better way to translate LLVM-IR to PTX. Do you still believe
> introducing file writing to an optimizer module is a good and portable
> solution?

Until any new infrastructure is implemented, I don't see it being any worse
of a solution. Don't get me wrong, I think the llvm.codegen() intrinsic is a
fast way to get things up and running for the GSoC project; but I also agree
with Dan and Evan that it's not appropriate for LLVM mainline. There are
just too many subtle details, and this really only handles the case of host
code needing the device code as text assembly.

To support opt-level transforms, you could just embed the generated IR as
text in the module, then invoke a separate tool to extract that into a
separate module.

The more I think about this, the more I become convinced that we could
benefit from a module "container," similar to a Mac fat/universal binary.
Something like this probably wouldn't be too hard to implement; the main
problem I see is what llc outputs. Maybe a single llc invocation would only
process one module in the container.

> Cheers
> Tobi

--
Thanks,
Justin Holewinski
dag at cray.com
2012-May-08 16:29 UTC
[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation
Tobias Grosser <tobias at grosser.es> writes:

> So why the intrinsic? I want to create the PTX string from an LLVM-IR
> optimizer pass, that should be loaded into clang, dragonegg, opt, ..

You want to codegen in the optimizer? I'm confused.

> An LLVM-IR optimizer pass does not have access to the file system and
> it can not link to the LLVM back ends to directly create PTX. Creating
> PTX in an optimizer pass would be an ugly hack.

So you _don't_ want to codegen in the optimizer. Now I'm really
confused.

> The cleaner solution is to store an LLVM-IR string in the host module
> and to mark it with the llvm.codegen() intrinsic. When the module is
> processed by the backend, the string is automatically translated to
> PTX. This requires no additional file writing, introduces no layering
> violations and seems to be very simple.

Why do you need to store IR in a string? It's already in the IR file or
you can put it into another file. All you need is an _external_ tool to
drive llc to process and codegen these multiple files (to multiple
targets) and then another tool to suck up the accelerator code into a
string in the host assembly file. Then you assemble into an object. No
IR changes and you end up with one object file. No changes to build
systems at all, it's all handled by a driver.

llvm.codegen is completely unnecessary.

-Dave
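
[Editor's illustration, not part of the original message.] The external-driver flow described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper names (`llc_command`, `asm_escape`, `embed_as_asm_string`, `drive`), the symbol name `device_code`, the intermediate file names, and the ELF-style `.rodata` section are all hypothetical choices made for illustration, and nothing in the thread specifies them. It assumes an `llc` binary on the PATH and a GNU-style assembler that understands `.asciz` with octal escapes.

```python
import subprocess


def llc_command(ir_path, out_path, march=None):
    """Build an llc command line that compiles one IR file to assembly."""
    cmd = ["llc", ir_path, "-o", out_path]
    if march is not None:
        # Select the target architecture explicitly, e.g. for device code.
        cmd.append("-march=" + march)
    return cmd


def asm_escape(data):
    """Escape bytes for use inside a GNU-as style double-quoted string."""
    out = []
    for b in data:
        ch = chr(b)
        if 0x20 <= b < 0x7F and ch not in '"\\':
            out.append(ch)            # printable ASCII is emitted as-is
        else:
            out.append("\\%03o" % b)  # everything else as a 3-digit octal escape
    return "".join(out)


def embed_as_asm_string(symbol, payload):
    """Return assembly text defining `symbol` as a NUL-terminated string."""
    return "\n".join([
        "\t.section .rodata",  # assumption: ELF-style section name
        "\t.globl " + symbol,
        symbol + ":",
        '\t.asciz "' + asm_escape(payload) + '"',
        "",
    ])


def drive(host_ir, device_ir, device_march, out_asm, symbol="device_code"):
    """Hypothetical driver: codegen both modules, then append the device
    assembly text to the host assembly file as a string constant."""
    subprocess.run(llc_command(device_ir, "device.s", device_march), check=True)
    subprocess.run(llc_command(host_ir, "host.s"), check=True)
    with open("device.s", "rb") as f:
        device_text = f.read()
    with open("host.s", "r") as f:
        host_text = f.read()
    with open(out_asm, "w") as f:
        f.write(host_text)
        f.write(embed_as_asm_string(symbol, device_text))
```

Something like `drive("host.ll", "kernel.ll", "nvptx64", "combined.s")` would then leave a single assembly file to hand to the assembler, which matches the "one object file, no IR changes" outcome argued for above. Whether this is preferable to the intrinsic is exactly what the thread is debating.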