thr3ads.net - llvm dev - [LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation [May 2012]

dag at cray.com

2012-May-01 21:21 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Justin Holewinski <justin.holewinski at gmail.com> writes:
>     For something like PTX, runtime calls take care of the call semantics so
>     it is either up to the user or the frontend to set up the runtime calls
>     correctly.  We don't need to completely solve this problem.  Yet.  :)
>
> But there has to be some interface that allows an LLVM IR function
> from one architecture to get at the code or name of a function from
> another architecture.  This could be handled in the front-end, but it
> seems like we could design some abstraction.
Doesn't LLVM support taking the address of a function in another address
space?  If not it probably should.
>     > If you have a global variable, what target "sees" it?  Does it need to
>     > be annotated along with the function?
>    
> Sorry, I meant global variables in the LLVM IR.  Are they valid for
> only one architecture in the IR module?
Ah.  It very much depends on the system architecture.  Since current PTX
targets run in an entirely separate address space globals would have to
be replicated and copied to/from the device.  This might require
target-specific modules.

For a system with shared memory, I would assume the globals could simply
be shared "as usual."  Otherwise, it wouldn't be shared memory.  In a
target-specific module design, one or the other would be an extern
reference.
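
The replication scheme for a separate address space can be sketched on the host side as follows. This is only an illustration: the device is simulated by a private byte buffer, and `DeviceMemory`, `copyToDevice`, `copyFromDevice`, `runIncrementKernel`, and `roundTripGlobal` are hypothetical names, not part of LLVM or of any GPU runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a device with its own address space.
// Host code cannot take a pointer into it; it can only copy bytes in and out.
class DeviceMemory {
public:
  explicit DeviceMemory(std::size_t bytes) : storage_(bytes) {}

  // Copy n bytes from host memory into the device replica.
  void copyToDevice(std::size_t offset, const void *src, std::size_t n) {
    assert(offset + n <= storage_.size());
    std::memcpy(storage_.data() + offset, src, n);
  }

  // Copy n bytes from the device replica back into host memory.
  void copyFromDevice(void *dst, std::size_t offset, std::size_t n) const {
    assert(offset + n <= storage_.size());
    std::memcpy(dst, storage_.data() + offset, n);
  }

  // Simulates a kernel that increments an int living in device memory.
  void runIncrementKernel(std::size_t offset) {
    int value = 0;
    std::memcpy(&value, storage_.data() + offset, sizeof value);
    ++value;
    std::memcpy(storage_.data() + offset, &value, sizeof value);
  }

private:
  std::vector<unsigned char> storage_;
};

// Replicate a host global on the device, run the kernel on the replica,
// then copy the updated value back to the host.
int roundTripGlobal(int hostGlobal) {
  DeviceMemory device(sizeof(int));
  device.copyToDevice(0, &hostGlobal, sizeof hostGlobal);
  device.runIncrementKernel(0);
  int result = 0;
  device.copyFromDevice(&result, 0, sizeof result);
  return result;
}
```

With shared memory, the two copy steps would disappear and both sides would simply refer to the same symbol, one of them through an extern declaration.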
>     > If you're targeting Cell, in contrast, you'd want to compile both down
>     > to object files.
>    
>     I think we probably want to do that for PTX as well.
>
> Maybe, maybe not.  It may make sense to rely on run-time JIT'ing of the
> PTX.
That happens regardless.  There is no way to produce instructions "to
the metal" for NVIDIA targets.  I was referring to PTX object files
above.
>     Do we allow more than one Module per file?  If not, that seems like an
>     arbitrary limitation.  If we allowed that we could have each module
>     specify a different target.
>
> That could work.
Given your questions about globals above, I think it might be a
requirement unless we want to require code for separate targets live in
separate files.  I think that's too restrictive because some opt pass
might want to extract kernels and put them on separate targets.

                              -Dave

Tobias Grosser

2012-May-07 08:15 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

On 05/01/2012 11:21 PM, dag at cray.com wrote:
> Justin Holewinski <justin.holewinski at gmail.com> writes:
>
>>      For something like PTX, runtime calls take care of the call semantics so
>>      it is either up to the user or the frontend to set up the runtime calls
>>      correctly.  We don't need to completely solve this problem.  Yet.  :)
>>
>> But there has to be some interface that allows an LLVM IR function
>> from one architecture to get at the code or name of a function from
>> another architecture.  This could be handled in the front-end, but it
>> seems like we could design some abstraction.
>
> Doesn't LLVM support taking the address of a function in another address
> space?  If not it probably should.
Hi Dave,

I highly appreciate your idea of integrating heterogeneous computing 
features directly into LLVM-IR. I believe this can be a way worth going, 
but I doubt now is the right moment for it. I don't share your opinion 
that it is easy to move LLVM-IR in this direction, but I rather believe 
that this is an engineering project that will take several months of 
full time work. Possibly not the implementation itself, but designing 
it, discussing it, implementing it and ensuring that the new feature 
does not increase run-time and memory footprint or reduce 
maintainability of LLVM. Due to the large amount of changes that would 
be needed all over LLVM, I really think we should first get some 
experience in this area, before we burn this feature into LLVM-IR.

The llvm.codegen intrinsic seems the perfect match to build up such 
experience. It requires no changes to LLVM-IR itself and only very local 
changes to the generic back end infrastructure. It may possibly not be 
as generic as other solutions, but it is far from being an ugly hack. 
Quite in contrast, it is a close match for OpenCL like run times and 
works well with the existing PTX back end.

Do you have definitive plans to add heterogeneous computing capabilities
to LLVM-IR within the next couple (3-4) months? Will these capabilities
supersede the llvm.codegen intrinsic?

In case such plans do not exist, what do you think about adding the 
llvm.codegen() intrinsic for now? If mid-term plans exist for 
heterogeneous extensions to LLVM-IR, we can document them along the 
intrinsic.

Cheers
Tobi

Tobias Grosser

2012-May-07 08:31 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

On 05/01/2012 11:21 PM, dag at cray.com wrote:
> Justin Holewinski <justin.holewinski at gmail.com> writes:
>>      Do we allow more than one Module per file?  If not, that seems like an
>>      arbitrary limitation.  If we allowed that we could have each module
>>      specify a different target.
>>
>> That could work.
>
> Given your questions about globals above, I think it might be a
> requirement unless we want to require code for separate targets live in
> separate files.  I think that's too restrictive because some opt pass
> might want to extract kernels and put them on separate targets.
I think we need several modules per file. Supporting AMDIL and PTX at 
the same time sounds more than useful.

Another question comes to mind. If we support several modules,
what would the command line options to opt look like? Do we want to make
all options sub-module specific? Making this user friendly may be
difficult. The same goes for the output of llc. At the moment llc can dump
the assembly to stdout. Would you dump the assembly of the different
modules to stdout or do you want to support multiple -o options to
specify the various output files?

The same goes for the LLVM CodeGen/Target API. It may have to be changed to
support the output of several modules or the specification of different
options for each module. We also have the same problems that Justin
pointed out for the codegen intrinsic. Some llc options are globals;
they would need to be made Codegen options if we want to set them on a
per-module basis.

Cheers
Tobi

dag at cray.com

2012-May-07 16:07 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Tobias Grosser <tobias at grosser.es> writes:
>> Doesn't LLVM support taking the address of a function in another address
>> space?  If not it probably should.
>
> Hi Dave,
>
> I highly appreciate your idea of integrating heterogeneous computing
> features directly into LLVM-IR. I believe this can be a way worth
> going, but I doubt now is the right moment for it. I don't share your
> opinion that it is easy to move LLVM-IR in this direction, but I
> rather believe that this is an engineering project that will take
> several months of full time work. Possibly not the implementation
> itself, but designing it, discussing it, implementing it and ensuring
> that the new feature does not increase run-time and memory footprint
> or reduce maintainability of LLVM. Due to the large amount of changes
> that would be needed all over LLVM, I really think we should first get
> some experience in this area, before we burn this feature into
> LLVM-IR.
I'm not advocating that we rush into this by any means.  I'm well aware
that the discussions and experiments will take quite a while to plow
through.  I think a small set of enhancements will go a long way.  I'd
like to avoid hacking in special intrinsics like llvm.codegen that feel
so very much in opposition to the rest of the LLVM design.
> The llvm.codegen intrinsic seems the perfect match to build up such
> experience. It requires no changes to LLVM-IR itself and only very
> local changes to the generic back end infrastructure. It may possibly
> not be as generic as other solutions, but it is far from being an ugly
> hack. Quite in contrast, it is a close match for OpenCL like run times
> and works well with the existing PTX back end.
I'll bite my tongue on the designs of OpenCL and CUDA.  :)

But regardless, if those are your targets you don't need llvm.codegen at
all.
> Do you have definitive plans to add heterogeneous computing
> capabilities to LLVM-IR within the next couple (3-4) months? Will
> these capabilities supersede the llvm.codegen intrinsic?
No specific plans to change the IR.  We have not found a need for such
changes on current architectures, as the runtimes provided with those
architectures handle the ugly details.  I am thinking further into the
future and what might be needed there.
> In case such plans do not exist, what do you think about adding the
> llvm.codegen() intrinsic for now? If mid-term plans exist for
> heterogeneous extensions to LLVM-IR, we can document them along the
> intrinsic.
I think it's completely unnecessary if your goal is to get something
working on current hardware.

We do have certain structural/software engineering changes to the
implementation of LLVM's code generator that would be useful.  The
primary one is the ability to completely process one function before
moving on to the next.  This is important when dealing with heterogeneous
systems, as one has to, for example, write out different asm for the
various targets at a function granularity.  But that doesn't require any
IR changes whatsoever.
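
As a rough sketch of what function-at-a-time processing could look like, the loop below finishes each function before starting the next and appends its output to a per-target buffer. The `Function` struct and `emitPerFunction` are hypothetical simplifications, not LLVM classes or APIs, and the emitted text is a placeholder rather than real assembly.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a function tagged with its target.
struct Function {
  std::string name;
  std::string target; // illustrative target name, e.g. "x86_64" or "ptx"
};

// Fully process one function before moving on to the next, appending its
// output to the buffer that belongs to that function's target.
std::map<std::string, std::string>
emitPerFunction(const std::vector<Function> &functions) {
  std::map<std::string, std::string> asmByTarget;
  for (const Function &f : functions) {
    std::string &out = asmByTarget[f.target];
    // Placeholder for the complete per-function pipeline
    // (selection, scheduling, register allocation, emission).
    out += f.name + ":\n";
    out += "  ; code for " + f.name + " (" + f.target + ")\n";
  }
  return asmByTarget;
}
```

Because each function is complete before the next begins, functions for different targets can be interleaved in the input while each target's output stays in source order.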

                                   -Dave

dag at cray.com

2012-May-07 16:11 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Tobias Grosser <tobias at grosser.es> writes:
> I think we need several modules per file. Supporting AMDIL and PTX at
> the same time sounds more than useful.
Yes.
> Another question comes to mind. If we support several modules,
> what would the command line options to opt look like? Do we want to
> make all options sub-module specific? Making this user friendly may
> be difficult. The same goes for the output of llc. At the moment llc
> can dump the assembly to stdout. Would you dump the assembly of the
> different modules to stdout or do you want to support multiple -o
> options to specify the various output files?
I think you're making this too complicated.  I think opt should continue
to work the way it does now.  Apply the same flags to all modules.  If
the user wants different transformations based on target either the
target characteristics should inform the optimizer or the file should be
split into multiple IR files.
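
A minimal sketch of that idea, assuming a made-up `Module` record and a made-up unrolling heuristic: one set of flags is applied to every module, and any per-target difference comes from the module's own target string rather than from extra command line options. None of these names or numbers come from LLVM.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified module: a target string plus one tunable result.
struct Module {
  std::string triple;
  int unrollFactor;
  Module() : unrollFactor(1) {}
};

// The single set of flags the user passes; identical for every module.
struct Flags {
  bool unrollLoops;
};

// Made-up heuristic: the target's characteristics decide the unroll factor.
int preferredUnroll(const std::string &triple) {
  return triple.compare(0, 3, "ptx") == 0 ? 4 : 2;
}

// Apply the same flags to all modules; only the target informs the outcome.
void optimizeAll(std::vector<Module> &modules, const Flags &flags) {
  for (Module &m : modules) {
    if (flags.unrollLoops)
      m.unrollFactor = preferredUnroll(m.triple);
  }
}
```

A user who needs genuinely different flags per target would, under this scheme, split the input into separate IR files and run opt once per file.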
> The same goes for the LLVM CodeGen/Target API. It may have to be
> changed to support the output of several modules or the specification
> of different options for each module. We also have the same problems
> that Justin pointed out for the codegen intrinsic. Some llc options
> are globals; they would need to be made Codegen options if we want to
> set them on a per-module basis.
Can you give me some examples?  What kinds of options would be
target-specific and not implied by the target attribute on the Module?

                              -Dave

dag at cray.com

2012-May-07 16:13 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Tobias Grosser <tobias at grosser.es> writes:
> Would you dump the assembly of the different modules to stdout or do
> you want to support multiple -o options to specify the various output
> files?
I forgot to address this one.  With current OpenCL and CUDA
specifications, there's no need to do multiple .o files.  In my mind,
llc should output one .o (one .s, etc.).  Anything else wreaks havoc on
build systems.

But Chris has the final say, I think.

                               -Dave
