thr3ads.net - llvm dev - [LLVMdev] Backend vs JIT : GPU [Oct 2013]

Gopal Rastogi

2013-Oct-09 15:13 UTC

[LLVMdev] Backend vs JIT : GPU

Hi guys,

I am trying to understand the OpenCL compilation flow on GPUs in order to
develop an OpenCL runtime for new hardware.

My understanding is that the OpenCL compiler is part of the vendor's runtime
library, which is the heart of OpenCL. Since an OpenCL kernel is compiled at
runtime, at a high level its compilation takes place in two steps:
i.  source code is first converted to intermediate code.
ii. intermediate code is then translated to target binary code.

For example, say we have an OpenCL kernel source file
vectorAdd_kernel.cl:
1. OpenCL compilation flow on Nvidia GPUs
   a. vectorAdd_kernel.cl is first translated to LLVM IR using clang, and
      the LLVM IR is converted into optimized LLVM IR using the LLVM optimizer.
   b. The optimized LLVM IR is then translated to vectorAdd_kernel.ptx using
      the back-end.
   c. vectorAdd_kernel.ptx is then translated to a vectorAdd_kernel.bin file
      using JIT. Nvidia uses JIT so that the code still benefits when
      next-generation GPUs are encountered.

2. OpenCL compilation on AMD GPUs
   a. vectorAdd_kernel.cl is first translated to LLVM IR using gcc/clang.
   b. The LLVM IR is then converted into optimized LLVM IR using the LLVM
      optimizer.
   c. The optimized LLVM IR is then converted into AMD IL.
   d. AMD IL is then converted into AMD ISA using the shader compiler (GPU
      JIT).
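
To make the comparison easier, the two flows above can be written down as
data. This is only a sketch of the steps as listed in this message, not a
description of either vendor's actual implementation. The names Stage,
NVIDIA_FLOW, AMD_FLOW and both helper functions are hypothetical, and
treating PTX and AMD IL as the last device-independent forms is an
assumption made for the sketch.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    tool: str              # component named in the message above
    output: str            # artifact produced by that component
    device_specific: bool  # assumption: True if tied to one GPU generation


# Steps transcribed from the two lists above.
NVIDIA_FLOW = [
    Stage("clang", "LLVM IR", False),
    Stage("LLVM optimizer", "optimized LLVM IR", False),
    Stage("back-end", "PTX", False),
    Stage("JIT", "device binary", True),
]

AMD_FLOW = [
    Stage("gcc/clang", "LLVM IR", False),
    Stage("LLVM optimizer", "optimized LLVM IR", False),
    Stage("back-end", "AMD IL", False),
    Stage("shader compiler (GPU JIT)", "AMD ISA", True),
]


def last_portable_output(flow: List[Stage]) -> str:
    """Return the last artifact that is not tied to a specific GPU."""
    portable = [stage.output for stage in flow if not stage.device_specific]
    return portable[-1]


def device_specific_tools(flow: List[Stage]) -> List[str]:
    """Return the tools whose output depends on the concrete GPU."""
    return [stage.tool for stage in flow if stage.device_specific]


if __name__ == "__main__":
    for name, flow in (("Nvidia", NVIDIA_FLOW), ("AMD", AMD_FLOW)):
        print(name, "->", last_portable_output(flow),
              "->", device_specific_tools(flow))
```

Written this way, the question below comes down to which component owns
the final, device-specific stage in each flow.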

My understanding is that AMD performs back-end compilation as part of the
JIT, whereas Nvidia keeps the back-end separate from the JIT.

Is that correct? If so, what are the advantages of keeping the JIT
separate from the back-end?

Thanks for your comments/opinions,
-Gopal

Dmitry Mikushin

2013-Oct-09 15:38 UTC

Hi Gopal,

The reason is the absence or presence of an open-source IR->ISA translation
component.

1.c vectorAdd_kernel.ptx is then translated to vectorAdd_kernel.cubin,
which contains device-specific binary code. Translation is performed either
by the NVIDIA CUDA runtime library (see e.g. cuModuleLoad), which is what is
referred to as JIT, or with the ptxas command line tool. In both cases, the
translation stage involves closed-source components of the NVIDIA CUDA
toolkit, which are not part of LLVM. There are some alternatives, such as
NVVM, asfermi, and PathScale.
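
The two routes for that last step can be sketched as follows. This is an
illustration only: ptxas_command and pick_image are hypothetical helper
names, the architecture strings are example values, and matching a cubin
by exact architecture name is a simplification of the real compatibility
rules. The ptxas invocation uses only its -arch and -o options; nothing is
executed here.

```python
from typing import Dict, List, Tuple, Union


def ptxas_command(ptx_path: str, arch: str, out_path: str) -> List[str]:
    """Offline route: build a ptxas command line for one concrete GPU arch."""
    return ["ptxas", f"-arch={arch}", ptx_path, "-o", out_path]


def pick_image(device_arch: str,
               cubins: Dict[str, bytes],
               ptx: str) -> Tuple[str, Union[bytes, str]]:
    """Decide what to hand to the driver for a given device.

    cubins maps an architecture name to a precompiled binary.
    If no binary matches, fall back to the PTX text so that the driver
    can JIT-compile it for a GPU that was unknown at build time.
    """
    if device_arch in cubins:
        return ("cubin", cubins[device_arch])
    return ("ptx", ptx)


if __name__ == "__main__":
    print(ptxas_command("vectorAdd_kernel.ptx", "sm_35",
                        "vectorAdd_kernel.cubin"))
    prebuilt = {"sm_35": b"\x7fELF..."}  # placeholder bytes, not a real cubin
    print(pick_image("sm_35", prebuilt, "// ptx text")[0])
    print(pick_image("sm_50", prebuilt, "// ptx text")[0])
```

The fallback branch is where the JIT pays off: the PTX shipped with the
application stays usable on hardware released after the application was
built.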

AFAIK, the AMD pipeline in contrast has two options: the closed-source
(Catalyst) driver and the open-source driver.

Best,
- D.




Micah Villmow

2013-Oct-09 15:44 UTC

Gopal,
I gave a presentation on how AMD compiles OpenCL here:
http://llvm.org/devmtg/2010-11/Villmow-OpenCL.pdf

Micah

