thr3ads.net - llvm dev - [LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation [May 2012]

dag at cray.com

2012-May-01 15:22 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Justin Holewinski <justin.holewinski at gmail.com> writes:
>     I don't think the code base changes are all that bad.  We have a number
>     of them to support generating code one function at a time rather than a
>     whole module together.  They've been sitting around waiting for us to
>     send them upstream.  It would be an easy matter to simply annotate each
>     function with its target.  We don't currently do that because we never
>     write out such IR files but it seems like a simple problem to solve to
>     me.
>
> If such changes are almost ready to be up-streamed, then great!
Just to clarify, the current changes simply allow a function to be
completely processed (including asm generation) before the next function
is sent to codegen.
> It just seems like a fairly non-trivial task to actually implement
> function-level target selection, especially when you consider function
> call semantics, taking the address of a function, etc.
For something like PTX, runtime calls take care of the call semantics so
it is either up to the user or the frontend to set up the runtime calls
correctly.  We don't need to completely solve this problem.  Yet.  :)
> If you have a global variable, what target "sees" it?  Does it need to
> be annotated along with the function?
For a tool like llc, wouldn't it be simply a matter of changing
TheTarget and reconstituting the various passes?  The changes we have
waiting to upstream already allow us to reconstitute passes.  I
sometimes use this to turn on/off debugging on a function-level basis.

The way we've constructed our backend interface should just allow us to
switch the target and reinitialize everything.  I'm sure I'm glossing
over tons of details but I don't see a fundamental architectural problem
in LLVM that would prevent this.
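
As a rough illustration of the idea (not the actual patches, which are not shown in this thread), here is a minimal sketch of per-function codegen that rebuilds the pass pipeline only when the target changes. Everything in it is hypothetical: `Function`, its `target` annotation, and `build_pipeline` are stand-ins, not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass
class Function:
    name: str
    target: str  # hypothetical per-function target annotation


def build_pipeline(target):
    """Stand-in for reconstituting the codegen passes for one target."""
    return lambda fn: f"; {target} asm for {fn.name}"


def codegen_per_function(functions):
    """Process each function completely before the next one, rebuilding
    the pipeline only when the annotated target changes."""
    out = []
    current_target, pipeline, rebuilds = None, None, 0
    for fn in functions:
        if fn.target != current_target:
            current_target = fn.target
            pipeline = build_pipeline(current_target)
            rebuilds += 1
        out.append(pipeline(fn))
    return out, rebuilds
```

Keeping functions for the same target adjacent keeps the number of pipeline rebuilds low; the sketch says nothing about calls or pointers that cross targets.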
> Can functions from two different targets share this pointer?  
Again, in the case of PTX it's the runtime's responsibility to ensure
this.  I agree passing pointers around complicates things in the general
case but I also think it's a solvable problem.
> For Yabin's use-case, the X86 portions need to be compiled to
> assembly, or even an object file, while the PTX portions need to be
> lowered to an assembly string and embedded in the X86 source (or
> written to disk somewhere).  
I think it's just a matter of switching to a different AsmWriter.  The
PTX runtime can load objects from files.  The code doesn't have to be a
string in the x86 object file.
> If you're targeting Cell, in contrast, you'd want to compile both down
> to object files.
I think we probably want to do that for PTX as well.
> For me, the bigger question is: do we extend the IR to support
> multiple targets, or do we keep the one-target-per-module philosophy
> and derive some other way of representing how the modules fit
> together?  I can see pros and cons for both approaches.
Me too.
> What if instead of per-function annotations, we implement something
> like module file sections?  You could organize a module file into
> logical sections based on target architecture.  I'm just throwing that
> out there.
Do we allow more than one Module per file?  If not, that seems like an
arbitrary limitation.  If we allowed that we could have each module
specify a different target.
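
To make the "more than one Module per file" idea concrete, here is a minimal sketch that splits such a file into per-target modules. The `; MODULE` separator is purely hypothetical (LLVM defines no such syntax), and the triple strings are only illustrative; the `target triple = "..."` line is the ordinary module-level directive.

```python
import re

SEPARATOR = "; MODULE"  # hypothetical marker; LLVM defines no such syntax


def split_modules(text):
    """Split a multi-module file into (target_triple, body) pairs."""
    modules = []
    for chunk in text.split(SEPARATOR):
        body = chunk.strip()
        if not body:
            continue
        match = re.search(r'^target triple = "([^"]+)"', body, re.MULTILINE)
        triple = match.group(1) if match else None
        modules.append((triple, body))
    return modules


EXAMPLE = '''; MODULE
target triple = "x86_64-unknown-linux-gnu"
declare void @launch()

; MODULE
target triple = "ptx64-unknown-unknown"
define void @kernel() {
  ret void
}
'''

for triple, body in split_modules(EXAMPLE):
    print(triple, len(body.splitlines()))
```

A driver could then hand each body to the backend selected by its triple, which is the one-target-per-module model applied once per section of the file.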

                                 -Dave

Justin Holewinski

2012-May-01 18:31 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

On Tue, May 1, 2012 at 8:22 AM, <dag at cray.com> wrote:
> Justin Holewinski <justin.holewinski at gmail.com> writes:
>
> >     I don't think the code base changes are all that bad.  We have a
> >     number of them to support generating code one function at a time
> >     rather than a whole module together.  They've been sitting around
> >     waiting for us to send them upstream.  It would be an easy matter to
> >     simply annotate each function with its target.  We don't currently do
> >     that because we never write out such IR files but it seems like a
> >     simple problem to solve to me.
> >
> > If such changes are almost ready to be up-streamed, then great!
>
> Just to clarify, the current changes simply allow a function to be
> completely processed (including asm generation) before the next function
> is sent to codegen.
>
> > It just seems like a fairly non-trivial task to actually implement
> > function-level target selection, especially when you consider function
> > call semantics, taking the address of a function, etc.
>
> For something like PTX, runtime calls take care of the call semantics so
> it is either up to the user or the frontend to set up the runtime calls
> correctly.  We don't need to completely solve this problem.  Yet.  :)
>
But there has to be some interface that allows an LLVM IR function from one
architecture to get at the code or name of a function from another
architecture.  This could be handled in the front-end, but it seems like we
could design some abstraction.

>
> > If you have a global variable, what target "sees" it?  Does it need to
> > be annotated along with the function?
>
> For a tool like llc, wouldn't it be simply a matter of changing
> TheTarget and reconstituting the various passes?  The changes we have
> waiting to upstream already allow us to reconstitute passes.  I
> sometimes use this to turn on/off debugging on a function-level basis.
>
> The way we've constructed our backend interface should just allow us to
> switch the target and reinitialize everything.  I'm sure I'm glossing
> over tons of details but I don't see a fundamental architectural problem
> in LLVM that would prevent this.
>
Sorry, I meant global variables in the LLVM IR.  Are they valid for only
one architecture in the IR module?

>
> > Can functions from two different targets share this pointer?
>
> Again, in the case of PTX it's the runtime's responsibility to ensure
> this.  I agree passing pointers around complicates things in the general
> case but I also think it's a solvable problem.
>
> > For Yabin's use-case, the X86 portions need to be compiled to
> > assembly, or even an object file, while the PTX portions need to be
> > lowered to an assembly string and embedded in the X86 source (or
> > written to disk somewhere).
>
> I think it's just a matter of switching to a different AsmWriter.  The
> PTX runtime can load objects from files.  The code doesn't have to be a
> string in the x86 object file.
>
> > If you're targeting Cell, in contrast, you'd want to compile both down
> > to object files.
>
> I think we probably want to do that for PTX as well.
>
Maybe, maybe not.  It may make sense to rely on run-time JIT'ing of the PTX.

>
> > For me, the bigger question is: do we extend the IR to support
> > multiple targets, or do we keep the one-target-per-module philosophy
> > and derive some other way of representing how the modules fit
> > together?  I can see pros and cons for both approaches.
>
> Me too.
>
> > What if instead of per-function annotations, we implement something
> > like module file sections?  You could organize a module file into
> > logical sections based on target architecture.  I'm just throwing that
> > out there.
>
> Do we allow more than one Module per file?  If not, that seems like an
> arbitrary limitation.  If we allowed that we could have each module
> specify a different target.
>
That could work.

>
>                                 -Dave
>


-- 

Thanks,

Justin Holewinski

dag at cray.com

2012-May-01 21:21 UTC

[LLVMdev] [PATCH][RFC] Add llvm.codegen Intrinsic To Support Embedded LLVM IR Code Generation

Justin Holewinski <justin.holewinski at gmail.com> writes:
>     For something like PTX, runtime calls take care of the call semantics so
>     it is either up to the user or the frontend to set up the runtime calls
>     correctly.  We don't need to completely solve this problem.  Yet.  :)
>
> But there has to be some interface that allows an LLVM IR function
> from one architecture to get at the code or name of a function from
> another architecture.  This could be handled in the front-end, but it
> seems like we could design some abstraction.
Doesn't LLVM support taking the address of a function in another address
space?  If not it probably should.
>     > If you have a global variable, what target "sees" it?  Does it need to
>     > be annotated along with the function?
>    
> Sorry, I meant global variables in the LLVM IR.  Are they valid for
> only one architecture in the IR module?
Ah.  It very much depends on the system architecture.  Since current PTX
targets run in an entirely separate address space globals would have to
be replicated and copied to/from the device.  This might require
target-specific modules.

For a system with shared memory, I would assume the globals could simply
be shared "as usual."  Otherwise, it wouldn't be shared memory.  In a
target-specific module design, one or the other would be an extern
reference.
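
The two cases can be sketched as a small decision function. This is a toy model only: the "host"/"device" labels, the `define`/`extern` strings, and the copy directions are illustrative assumptions, not anything LLVM or the PTX runtime provides.

```python
def plan_global(name, shared_memory):
    """Return a hypothetical per-target plan for one global variable."""
    if shared_memory:
        # One module defines the global; the other refers to it as extern.
        return {
            "host": f"define @{name}",
            "device": f"extern @{name}",
            "copies": [],
        }
    # Separate address spaces: replicate the global and copy it both ways.
    return {
        "host": f"define @{name}",
        "device": f"define @{name}",
        "copies": ["host->device", "device->host"],
    }
```

Which side holds the definition in the shared-memory case is an arbitrary choice here; the point is only that exactly one side defines the global.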
>     > If you're targeting Cell, in contrast, you'd want to compile both down
>     > to object files.
>    
>     I think we probably want to do that for PTX as well.
>
> Maybe, maybe not.  It may make sense to rely on run-time JIT'ing of the PTX.
That happens regardless.  There is no way to produce instructions "to
the metal" for NVIDIA targets.  I was referring to PTX object files
above.
>     Do we allow more than one Module per file?  If not, that seems like an
>     arbitrary limitation.  If we allowed that we could have each module
>     specify a different target.
>
> That could work.
Given your questions about globals above, I think it might be a
requirement unless we want to require code for separate targets live in
separate files.  I think that's too restrictive because some opt pass
might want to extract kernels and put them on separate targets.

                              -Dave
