thr3ads.net - llvm dev - [LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation [May 2012]


dag at cray.com

2012-May-07 22:14 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Tobias Grosser <tobias at grosser.es> writes:
>> I forgot to address this one.  With current OpenCL and CUDA
>> specifications, there's no need to do multiple .o files.  In my mind,
>> llc should output one .o (one .s, etc.).  Anything else wreaks havoc on
>> build systems.
>
> Yes, that's what I am advocating for. There is no need for all this
> complexity. Both standards store the embedded code as a string in the
> host module. That is exactly what the llvm.codegen intrinsic
> models. It requires zero further changes to the code generation
> backend.
But why do you need an intrinsic to do that?  Just generate the code to
a file and suck it into a string, maybe with an external "linker" tool.

If you just want something to work, that should be sufficient.  If you
want some long-term design/implementation I don't think llvm.codegen is
it.
> In contrast, extending LLVM-IR to support heterogeneous modules
> requires us to add logic to the llvm code generation that knows how to
> link the different sub-modules.
We already have the Linker.

                                -Dave

Tobias Grosser

2012-May-08 09:20 UTC



On 05/08/2012 12:14 AM, dag at cray.com wrote:
> Tobias Grosser <tobias at grosser.es> writes:
>
>>> I forgot to address this one.  With current OpenCL and CUDA
>>> specifications, there's no need to do multiple .o files.  In my mind,
>>> llc should output one .o (one .s, etc.).  Anything else wreaks havoc on
>>> build systems.
>>
>> Yes, that's what I am advocating for. There is no need for all this
>> complexity. Both standards store the embedded code as a string in the
>> host module. That is exactly what the llvm.codegen intrinsic
>> models. It requires zero further changes to the code generation
>> backend.
>
> But why do you need an intrinsic to do that?  Just generate the code to
> a file and suck it into a string, maybe with an external "linker" tool.
>
> If you just want something to work, that should be sufficient.  If you
> want some long-term design/implementation I don't think llvm.codegen is
> it.
OK. I think we are on the same track. Yes, there is no need for a lot of
infrastructure. Storing PTX in a string of the host module is the only
thing needed.

So why the intrinsic? I want to create the PTX string from an LLVM-IR
optimizer pass that should be loadable into clang, dragonegg, opt, ...
An LLVM-IR optimizer pass does not have access to the file system, and it
cannot link to the LLVM back ends to create PTX directly. Creating PTX
in an optimizer pass would be an ugly hack. The cleaner solution is to
store an LLVM-IR string in the host module and to mark it with the
llvm.codegen() intrinsic. When the module is processed by the backend,
the string is automatically translated to PTX. This requires no
additional file writing, introduces no layering violations and seems to
be very simple.

I don't see a better way to translate LLVM-IR to PTX. Do you still
believe introducing file writing to an optimizer module is a good and
portable solution?

Cheers
Tobi
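
The flow described above can be pictured with a toy model. This is only a sketch in plain Python, not LLVM code: HostModule, optimizer_pass, backend_lowering and fake_ptx_backend are hypothetical names invented for illustration, and the real intrinsic's signature is not modeled. It shows the division of labor being argued for: the optimizer side only stores and marks a string, and the translation happens later, on the backend side.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class HostModule:
    """Hypothetical stand-in for the string constants of a host module."""
    strings: Dict[str, str] = field(default_factory=dict)
    # Names of strings that hold device IR and are marked for code generation.
    marked_for_codegen: Set[str] = field(default_factory=set)


def optimizer_pass(module: HostModule, device_ir: str) -> None:
    """Optimizer side: embed the device IR as a string and mark it.

    No file is written and no backend is called here.
    """
    module.strings["kernel"] = device_ir
    module.marked_for_codegen.add("kernel")


def backend_lowering(module: HostModule, translate: Callable[[str], str]) -> None:
    """Backend side: replace every marked IR string with translated code."""
    for name in sorted(module.marked_for_codegen):
        module.strings[name] = translate(module.strings[name])
    module.marked_for_codegen.clear()


def fake_ptx_backend(ir: str) -> str:
    """Placeholder translator (assumption): a real backend would emit PTX."""
    return "// fake PTX for: " + ir
```

After backend_lowering runs, the host module holds only the translated string, which is the "PTX as a string in the host module" end state both sides of the thread agree on.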

Justin Holewinski

2012-May-08 15:08 UTC



On Tue, May 8, 2012 at 2:20 AM, Tobias Grosser <tobias at grosser.es> wrote:
> On 05/08/2012 12:14 AM, dag at cray.com wrote:
>
>> Tobias Grosser <tobias at grosser.es> writes:
>>
>>>> I forgot to address this one.  With current OpenCL and CUDA
>>>> specifications, there's no need to do multiple .o files.  In my mind,
>>>> llc should output one .o (one .s, etc.).  Anything else wreaks havoc on
>>>> build systems.
>>>
>>> Yes, that's what I am advocating for. There is no need for all this
>>> complexity. Both standards store the embedded code as a string in the
>>> host module. That is exactly what the llvm.codegen intrinsic
>>> models. It requires zero further changes to the code generation
>>> backend.
>>
>> But why do you need an intrinsic to do that?  Just generate the code to
>> a file and suck it into a string, maybe with an external "linker" tool.
>>
>> If you just want something to work, that should be sufficient.  If you
>> want some long-term design/implementation I don't think llvm.codegen is
>> it.
>
> OK. I think we are on the same track. Yes, there is no need for a lot of
> infrastructure. Storing PTX in a string of the host module, is the only
> thing needed.
>
> So why the intrinsic? I want to create the PTX string from an LLVM-IR
> optimizer pass, that should be loaded into clang, dragonegg, opt, ..
> An LLVM-IR optimizer pass does not have access to the file system and it
> can not link to the LLVM back ends to directly create PTX. Creating PTX in
> an optimizer pass would be an ugly hack. The cleaner solution is to store
> an LLVM-IR string in the host module and to mark it with the llvm.codegen()
> intrinsic. When the module is processed by the backend, the string is
> automatically translated to PTX. This requires no additional file writing,
> introduces no layering violations and seems to be very simple.
>
> I don't see a better way to translate LLVM-IR to PTX. Do you still believe
> introducing file writing to an optimizer module is a good and portable
> solution?
>
Until any new infrastructure is implemented, I don't see it being any worse
of a solution.  Don't get me wrong, I think the llvm.codegen() intrinsic is
a fast way to get things up and running for the GSoC project, but I also
agree with Dan and Evan that it's not appropriate for LLVM mainline. There
are just too many subtle details, and this really only handles the case of
host code needing the device code as text assembly.

To support opt-level transforms, you could just embed the generated IR as
text in the module, then invoke a separate tool to extract it into a
separate module.  The more I think about this, the more I become convinced
that we could benefit from a module "container," similar to a Mac
fat/universal binary.  Something like this probably wouldn't be too hard to
implement; the main problem I see is what llc outputs, or maybe a single
llc invocation would only process one module in the container.
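
To make the "container" idea a little more concrete, here is a minimal sketch in plain Python. It is not an existing LLVM facility: ModuleContainer and its methods are hypothetical, and the target names used as keys are illustrative only. It models one possible reading of the suggestion, where each tool invocation extracts and processes exactly one module from the container.

```python
from typing import Dict, List


class ModuleContainer:
    """Hypothetical container holding one IR module (as text) per target."""

    def __init__(self) -> None:
        self._modules: Dict[str, str] = {}

    def add(self, target: str, ir_text: str) -> None:
        """Store the IR for one target; refuse duplicates for the same target."""
        if target in self._modules:
            raise ValueError("container already has a module for " + target)
        self._modules[target] = ir_text

    def targets(self) -> List[str]:
        """List the targets present, in a stable order."""
        return sorted(self._modules)

    def extract(self, target: str) -> str:
        """Return the IR for a single target, as one tool invocation would."""
        return self._modules[target]


# Usage: a driver would pick one module per invocation.
container = ModuleContainer()
container.add("host", "define i32 @main() { ret i32 0 }")
container.add("device", "define void @kernel() { ret void }")
device_ir = container.extract("device")
```

Keeping the container a passive map leaves open the question raised above of what a single llc run should output; the sketch simply assumes one module per run.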



>
> Cheers
> Tobi
>


-- 

Thanks,

Justin Holewinski

dag at cray.com

2012-May-08 16:29 UTC



Tobias Grosser <tobias at grosser.es> writes:
> So why the intrinsic? I want to create the PTX string from an LLVM-IR
> optimizer pass, that should be loaded into clang, dragonegg, opt, ..
You want to codegen in the optimizer?  I'm confused.
> An LLVM-IR optimizer pass does not have access to the file system and
> it can not link to the LLVM back ends to directly create PTX. Creating
> PTX in an optimizer pass would be an ugly hack.
So you _don't_ want to codegen in the optimizer.  Now I'm really
confused.
> The cleaner solution is to store an LLVM-IR string in the host module
> and to mark it with the llvm.codegen() intrinsic. When the module is
> processed by the backend, the string is automatically translated to
> PTX. This requires no additional file writing, introduces no layering
> violations and seems to be very simple.
Why do you need to store IR in a string?  It's already in the IR file or
you can put it into another file.  All you need is an _external_ tool to
drive llc to process and codegen these multiple files (to multiple
targets) and then another tool to suck up the accelerator code into a
string in the host assembly file.  Then you assemble into an object.

No IR changes and you end up with one object file.  No changes to build
systems at all, it's all handled by a driver.

llvm.codegen is completely unnecessary.

                                 -Dave
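
The last step of the external-driver approach, sucking the accelerator code into a string in the host assembly file, might look roughly like the sketch below. It assumes the driver has already run llc once per target and has both outputs as text; that part is omitted. The helper names and the symbol name device_ptx are hypothetical, and the directives assume a GNU-style assembler that accepts .asciz.

```python
def escape_for_asciz(text: str) -> str:
    """Escape text so it can sit inside a double-quoted .asciz operand."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\t":
            out.append("\\t")
        elif 32 <= byte < 127:
            out.append(ch)
        else:
            # Any other byte is written as a three-digit octal escape.
            out.append("\\%03o" % byte)
    return "".join(out)


def embed_device_code(host_asm: str, symbol: str, device_code: str) -> str:
    """Append the device code to the host assembly as a NUL-terminated string."""
    lines = [
        host_asm.rstrip("\n"),
        "",
        "\t.section .rodata",
        "\t.globl " + symbol,
        symbol + ":",
        '\t.asciz "' + escape_for_asciz(device_code) + '"',
        "",
    ]
    return "\n".join(lines)


# Usage: both inputs would come from earlier llc runs in a real driver.
combined = embed_device_code("\t.text\n", "device_ptx", ".entry k\n{\n\tret;\n}\n")
```

The combined text is then assembled into a single object file, so the build system sees one input and one output and the IR itself is unchanged.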
