Hi All,

I am going to submit a GSoC proposal for LLVM this year, and I would like to first post it here to get constructive feedback before submitting it by the April 8 deadline. This is the first time I have submitted a GSoC proposal, so please be brutal with the feedback. :)

Additionally, Che-Liang Chiou (the code owner of the PTX back-end) has agreed to be my mentor if this is accepted. What does he need to do to become an official mentor?

=======
Overview
=======

The NVidia Parallel Thread eXecution (PTX) language is an assembly-like language that is used as an intermediate format for all GPU programs that execute on NVidia hardware. It is similar to many other three-address assembly formats, and hence is a great target for the LLVM code generation framework. Having a supported PTX code generator back-end in LLVM would allow users of LLVM to generate GPU code directly from LLVM IR, with appropriate use of PTX-specific intrinsics to support features such as thread/block id queries, texture sampling, and prefetching.

=====
Status
=====

For the last month, I have been working with Che-Liang Chiou to implement basic support for PTX code generation within the LLVM source tree. Currently, the back-end is capable of handling a small subset of LLVM IR, including integer and floating-point arithmetic, loads/stores, and basic branching. While this is enough to support basic computational kernels, there is still much to be done to support arbitrary LLVM IR.

=============
Qualifications
=============

As I have already contributed significant portions of code to the current PTX back-end, the learning curve for this project would be minimal. I am already comfortable working with the core LLVM libraries, as well as the LLVM code generation and selection DAG libraries. I have also been working with C/C++ for over 15 years.

I am currently a PhD student at the Ohio State University, pursuing a degree in Computer Science and Engineering. My research focus is high-performance code generation for multi-core and many-core architectures, specifically current GPU architectures, and I am primarily interested in the compiler technology to drive this. My interest in the PTX back-end started with a research interest in generating high-performance GPU code for stencil computations. While the PTX back-end is not my research focus, it is an important part of the infrastructure needed for a planned research compiler. I also have a personal interest in GPU code generation for graphics applications.

=======
Proposal
=======

For the 2011 Google Summer of Code program, I propose to implement the pieces of the PTX back-end that are currently missing or error-prone. This includes, but is not limited to:

* Implementing efficient instruction selection for floating-point IR instructions
  - e.g., selecting the most efficient instructions for different hardware
* Implementing the full range of integer and floating-point comparison instructions
* Implementing function calls
* Implementing jump tables
* Implementing the full range of LLVM intrinsics needed for "special" PTX instructions
  - e.g., texture mapping, prefetching
* Implementing support for v4f32 and similar vector types

In addition to these basic milestones, the driving goal would be to allow the PTX back-end to generate correct and efficient code for LLVM IR versions of the samples contained in the NVidia GPU Computing SDK.
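To make the target concrete, here is a minimal sketch of the kind of SDK-style kernel I have in mind (my own illustrative example, not taken from the SDK; the kernel name and shapes are arbitrary, and it assumes the source-level modifications for Clang discussed below):

  // Minimal SDK-style CUDA kernel: each thread computes one element of the
  // output vector. The threadIdx/blockIdx/blockDim reads are exactly the
  // "thread/block id queries" that map onto PTX special registers and would
  // be exposed to LLVM IR through PTX-specific intrinsics, while the loads,
  // stores, and arithmetic map directly onto ordinary PTX instructions.
  __global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)
      c[i] = a[i] + b[i];
  }

Kernels of roughly this complexity are what the back-end can already handle today; the milestones above are about extending that to the full range of constructs the SDK samples use.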
In other words, I want to be able to take the CUDA code from the SDK samples, generate LLVM IR with Clang (with appropriate source-level syntactic modifications), and generate efficient PTX code that is close in performance to that produced by the NVidia nvcc compiler. My limited testing so far has shown that code generated by the PTX back-end in its current form comes within 10% of the performance of identical code compiled with nvcc, and in some cases even marginally beats nvcc.

To accomplish this goal, I propose a two-phase implementation. In the first phase, I will implement as much of the PTX ISA as is representable in LLVM IR, and provide LLVM IR intrinsics for the rest. The goal of the first phase is to generate correct PTX code for arbitrary LLVM IR input. Some exceptions will be necessary; for example, it is currently not feasible to implement exception handling within PTX. After the code generator is able to produce correct code for a large set of complex LLVM IR input (including real-world computational kernels originally written in CUDA), I will begin phase two. In phase two, I will optimize the PTX back-end to generate efficient code. This will involve work on the instruction scheduler to take advantage of the instruction pipeline on the GPU hardware, and potentially on the register allocator as well.

=================
Advantage for LLVM
=================

The advantage of this project for the LLVM community would be the creation and maintenance of a functionally complete code generator for NVidia GPU hardware that can eventually be tied to the OpenCL and CUDA front-ends for Clang. It would be the first LLVM code generator for GPU architectures that is part of upstream LLVM, expanding the reach of LLVM to GPU architectures out-of-the-box. Additionally, the work in this proposal should be complete within the LLVM 3.0 timeline.

==========
Future Work
==========

In the future, the PTX back-end can be tied to the up-and-coming CUDA and OpenCL front-ends within Clang. This would provide a completely open-source implementation of both OpenCL and CUDA for NVidia hardware, with the only dependency being the NVidia CUDA SDK. While this integration work is outside the scope of this proposal, it is a good future use-case for the PTX back-end. However, I do not know the timelines for the implementation of these two front-ends, so I am unable to make any guarantees about that integration in this GSoC proposal.

=====
Mentor
=====

The code owner of the PTX back-end, Che-Liang Chiou, has agreed to mentor me for this project if it is accepted this year. However, I would welcome feedback from others working on the back-end code generators within LLVM.

--
Thanks,
Justin Holewinski
On Mar 28, 2011, at 6:12 AM, Justin Holewinski wrote:

> Hi All,
>
> I am going to submit a GSoC proposal for LLVM this year, and I would like to first post it here to get constructive feedback before I submit it before the April 8 deadline. This is the first time I have submitted a GSoC proposal, so please be brutal with the feedback. :)
>
> Additionally, Che-Liang Chiou (the code owner of the PTX back-end) has agreed to be my mentor if this is accepted. What does he need to do to become an official mentor?

He needs to sign up to be a mentor:
http://www.google-melange.com/gsoc/org/google/gsoc2011/llvm

Scroll down to "Or register as a mentor"

-Tanya
Hi, Justin

> I am going to submit a GSoC proposal for LLVM this year, and I would like to
> first post it here to get constructive feedback before I submit it before
> the April 8 deadline. This is the first time I have submitted a GSoC
> proposal, so please be brutal with the feedback. :)

Can I join this project, if possible? I am also interested in the PTX backend.

Regards,
chenwj

--
Wei-Ren Chen (陳韋任)
Computer System Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel: 886-2-2788-3799 #1667
On 03/28/2011 09:00 PM, 陳韋任 wrote:
> Hi, Justin
>
>> I am going to submit a GSoC proposal for LLVM this year, and I would like to
>> first post it here to get constructive feedback before I submit it before
>> the April 8 deadline. This is the first time I have submitted a GSoC
>> proposal, so please be brutal with the feedback. :)
>
> Can I join this project, if possible? I am also interested in PTX
> backend.

Hi chenwj,

you can obviously contribute to the LLVM PTX backend, and I am sure Justin will help to review your patches. However, if you are asking in respect of the Google Summer of Code, it is not possible for two students to apply together for one project. You can apply for the same project as Justin, and LLVM is free to take both of you; however, most probably the better application will win.

Another option is that you discuss with Justin and Che-Liang Chiou whether it is possible for both of you to focus on different parts of PTX code generation, such that you work in the same area, but on different sub-projects. Another project closely related to PTX could, for example, be work on the Clang CUDA or OpenCL front end, which Justin pointed out is still unfinished; see the "Future Work" section of his proposal.

Cheers
Tobi
On 03/28/2011 09:12 AM, Justin Holewinski wrote:
> Hi All,
>
> I am going to submit a GSoC proposal for LLVM this year, and I would
> like to first post it here to get constructive feedback before I submit
> it before the April 8 deadline. This is the first time I have submitted
> a GSoC proposal, so please be brutal with the feedback. :)

Hi Justin,

I think this is a great idea. I am highly interested in PTX code generation.

[...Proposal...]

The proposal is nice and shows that you already have a good idea of your project. Here are some ideas for how you can further improve it:

1. Milestones / timeline

You already have a two-phase development plan. I believe it would be nice if you could further split it into a set of smaller milestones. Each could include a short description of what you plan to deliver, how long its implementation will take, and when you plan to implement it during the summer of code. Those milestones could be sorted into the time frame you have for the GSoC. In addition, you could define "success criteria" for the midterm/final evaluation.

This will make it easy to see during GSoC whether you are on track with your project, and will allow you and your mentor to readjust your milestones if necessary.

When developing milestones and success criteria, it is better to be conservative and only add items you are confident you can implement during GSoC. You can additionally add a set of "if time permits" milestones, where you put the stuff that is not 100% needed, but that would be good to have.

2. It would be nice to include a description of the examples you have already tested.

3. Define the exceptions

It would be good to know what parts you definitely do not plan to implement, and ideally why not (postponed, impossible, not relevant, ...). Like this, people can understand to what extent your backend will be usable after the GSoC.

4. Phase two is currently a little short

What kind of optimizations do you plan? Do you already have an idea, or will you investigate this when you get to that point? How much time do you plan to spend on phase two? If it is more than two weeks, it would be good to elaborate a little on what exactly you plan to do there.

So that's all for the moment. As the application was already nice, I just did some conceptual nitpicking. ;-)

Cheers
Tobi
On Mon, Mar 28, 2011 at 10:19 PM, Tobias Grosser <grosser at fim.uni-passau.de> wrote:

> [...]
>
> So that's all for the moment. As the application was already nice, I just
> did some conceptual nitpicking. ;-)

Thanks for the comments! I have updated the proposal; it can be found at:
https://sites.google.com/site/justinholewinski/projects/gsoc/llvm-ptx-back-end-2011

Please let me know if you have any comments before I submit it to Melange in the next few days!

--
Thanks,
Justin Holewinski