thr3ads.net - llvm dev - [LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation [May 2012]

dag at cray.com

2012-May-08 16:29 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Tobias Grosser <tobias at grosser.es> writes:
> So why the intrinsic? I want to create the PTX string from an LLVM-IR
> optimizer pass, that should be loaded into clang, dragonegg, opt, ..
You want to codegen in the optimizer?  I'm confused.
> An LLVM-IR optimizer pass does not have access to the file system and
> it can not link to the LLVM back ends to directly create PTX. Creating
> PTX in an optimizer pass would be an ugly hack.
So you _don't_ want to codegen in the optimizer.  Now I'm really
confused.
> The cleaner solution is to store an LLVM-IR string in the host module
> and to mark it with the llvm.codegen() intrinsic. When the module is
> processed by the backend, the string is automatically translated to
> PTX. This requires no additional file writing, introduces no layering
> violations and seems to be very simple.
Why do you need to store IR in a string?  It's already in the IR file or
you can put it into another file.  All you need is an _external_ tool to
drive llc to process and codegen these multiple files (to multiple
targets) and then another tool to suck up the accelerator code into a
string in the host assembly file.  Then you assemble into an object.

No IR changes and you end up with one object file.  No changes to build
systems at all, it's all handled by a driver.

llvm.codegen is completely unnecessary.

                                 -Dave
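
The driver flow described above (run llc once per target, then pull the accelerator output into a string in the host assembly file) can be sketched roughly as follows. This is only an illustration, not code from the thread: the function names, the file names, the `nvptx64` target name and the `.rodata` placement are assumptions, and the sketch only builds the llc command lines rather than running them.

```python
from pathlib import Path


def llc_command(ir_file, march, out_file):
    """Build an llc argv that compiles one IR file to assembly for one target."""
    return ["llc", f"-march={march}", "-filetype=asm", str(ir_file), "-o", str(out_file)]


def plan_driver(host_ir, device_ir, host_march="x86", device_march="nvptx64"):
    """Return the llc invocations for the device module, then the host module.

    The target names are assumptions; substitute whatever the local llc supports.
    """
    host_asm = Path(host_ir).with_suffix(".s")
    device_asm = Path(device_ir).with_suffix(".ptx")
    return [
        llc_command(device_ir, device_march, device_asm),
        llc_command(host_ir, host_march, host_asm),
    ]


def embed_as_asm_string(symbol, text):
    """Render accelerator code as a NUL-terminated string for the host assembly file."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\t", "\\t")
    )
    return "\n".join([
        "\t.section .rodata",  # assumed ELF-style read-only data section
        f"\t.globl {symbol}",
        f"{symbol}:",
        f'\t.asciz "{escaped}"',
        "",
    ])
```

A real driver would execute each command, read the generated device assembly, append the output of `embed_as_asm_string` to the host assembly file, and then assemble the result into a single object.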

Justin Holewinski

2012-May-08 17:08 UTC

On Tue, May 8, 2012 at 9:29 AM, <dag at cray.com> wrote:
> Tobias Grosser <tobias at grosser.es> writes:
>
> > So why the intrinsic? I want to create the PTX string from an LLVM-IR
> > optimizer pass, that should be loaded into clang, dragonegg, opt, ..
>
> You want to codegen in the optimizer?  I'm confused.
> > An LLVM-IR optimizer pass does not have access to the file system and
> > it can not link to the LLVM back ends to directly create PTX. Creating
> > PTX in an optimizer pass would be an ugly hack.
>
> So you _don't_ want to codegen in the optimizer.  Now I'm really
> confused.
>
The device code IR would be generated in the optimization pass, and
codegen'd when the host module is codegen'd.

The word "codegen" is overloaded here, as we're talking about IR codegen
during optimization, and device codegen during host codegen.  Confusing,
no? :)

>
> > The cleaner solution is to store an LLVM-IR string in the host module
> > and to mark it with the llvm.codegen() intrinsic. When the module is
> > processed by the backend, the string is automatically translated to
> > PTX. This requires no additional file writing, introduces no layering
> > violations and seems to be very simple.
>
> Why do you need to store IR in a string?  It's already in the IR file or
> you can put it into another file.  All you need is an _external_ tool to
> drive llc to process and codegen these multiple files (to multiple
> targets) and then another tool to suck up the accelerator code into a
> string in the host assembly file.  Then you assemble into an object.
>
> No IR changes and you end up with one object file.  No changes to build
> systems at all, it's all handled by a driver.
>
> llvm.codegen is completely unnecessary.
>
I believe the point Tobias is trying to make is that he wants to retain the
ability to pipe modules between tools and not worry about the modules ever
hitting disk, e.g.

opt -load GPUOptimizer.so -gpu-opt | llc -march=x86

where the module coming in to opt is just unoptimized host code, and the
module coming out of opt has embedded GPU IR.

The llvm.codegen() does solve this problem, but at the cost of too much
ambiguity.

>
>                                 -Dave
>


-- 

Thanks,

Justin Holewinski

dag at cray.com

2012-May-08 17:47 UTC

Justin Holewinski <justin.holewinski at gmail.com> writes:
> I believe the point Tobias is trying to make is that he wants to
> retain the ability to pipe modules between tools and not worry about
> the modules ever hitting disk, e.g.
>
> opt -load GPUOptimizer.so -gpu-opt | llc -march=x86
>
> where the module coming in to opt is just unoptimized host code, and the
> module coming out of opt has embedded GPU IR.
So you want opt to extract kernels?

In that case, you'll have to replace llc with some external tool that
moves those kernels to separate IR files, invokes llc on them and then
links them back together.

                               -Dave

Tobias Grosser

2012-May-08 18:50 UTC

On 05/08/2012 07:08 PM, Justin Holewinski wrote:
> On Tue, May 8, 2012 at 9:29 AM, <dag at cray.com> wrote:
>
>     Tobias Grosser <tobias at grosser.es> writes:
>
>      > So why the intrinsic? I want to create the PTX string from an LLVM-IR
>      > optimizer pass, that should be loaded into clang, dragonegg, opt, ..
>
>     You want to codegen in the optimizer?  I'm confused.
>
>
>      > An LLVM-IR optimizer pass does not have access to the file system and
>      > it can not link to the LLVM back ends to directly create PTX. Creating
>      > PTX in an optimizer pass would be an ugly hack.
>
>     So you _don't_ want to codegen in the optimizer.  Now I'm really
>     confused.
>
>
> The device code IR would be generated in the optimization pass, and
> codegen'd when the host module is codegen'd.
>
> The word "codegen" is overloaded here, as we're talking about IR codegen
> during optimization, and device codegen during host codegen.  Confusing,
> no? :)
Correct.
>      > The cleaner solution is to store an LLVM-IR string in the host module
>      > and to mark it with the llvm.codegen() intrinsic. When the module is
>      > processed by the backend, the string is automatically translated to
>      > PTX. This requires no additional file writing, introduces no layering
>      > violations and seems to be very simple.
>
>     Why do you need to store IR in a string?  It's already in the IR file or
>     you can put it into another file.  All you need is an _external_ tool to
>     drive llc to process and codegen these multiple files (to multiple
>     targets) and then another tool to suck up the accelerator code into a
>     string in the host assembly file.  Then you assemble into an object.
>
>     No IR changes and you end up with one object file.  No changes to build
>     systems at all, it's all handled by a driver.
>
>     llvm.codegen is completely unnecessary.
>
>
> I believe the point Tobias is trying to make is that he wants to retain
> the ability to pipe modules between tools and not worry about the
> modules ever hitting disk, e.g.
>
> opt -load GPUOptimizer.so -gpu-opt | llc -march=x86
>
> where the module coming in to opt is just unoptimized host code, and the
> module coming out of opt has embedded GPU IR.
True.
> The llvm.codegen() does solve this problem, but at the cost of too much
> ambiguity.
Can this be solved by documentation changes?

Tobi
