On 12/14/2011 02:41 PM, Justin Holewinski wrote:
> I would favor calling conventions over metadata for the simple reason
> that this maps more cleanly to the device model. Device and kernel
> functions are represented differently in PTX, including (sometimes) the
> way parameters are passed.

For the record, marking the kernels with "calling conventions" instead
of metadata is fine also for the pocl use case. It's enough if there is a
way to differentiate OpenCL C kernels from the "device functions" for the
reason I discussed in the previous email. That is, in the pocl point of
view we just need a way to pick the "host-callable" kernel functions as
they need the special treatment before they can be called (like a C
function).

BTW what about the other OpenCL data like required_wg_size which
affect the possible "kernel treatment" of pocl and can be converted to
some special instructions (I suppose) for the SIMT targets? Currently
only the TCE target in Clang adds metadata for the required_wg_size
kernel attribute (as we need it in "offline compilation") but IMHO that
could be useful in general, as a default metadata (to enable its support
in pocl for all targets, for example).

--
Pekka
2011/12/14 Pekka Jääskeläinen <pekka.jaaskelainen at tut.fi>

> On 12/14/2011 02:41 PM, Justin Holewinski wrote:
>
>> I would favor calling conventions over metadata for the simple reason
>> that this maps more cleanly to the device model. Device and kernel
>> functions are represented differently in PTX, including (sometimes) the
>> way parameters are passed.
>
> For the record, marking the kernels with "calling conventions" instead
> of metadata is fine also for the pocl use case. It's enough if there is a
> way to differentiate OpenCL C kernels from the "device functions" for the
> reason I discussed in the previous email. That is, in the pocl point of
> view we just need a way to pick the "host-callable" kernel functions as
> they need the special treatment before they can be called (like a C
> function).
>
> BTW what about the other OpenCL data like required_wg_size which
> affect the possible "kernel treatment" of pocl and can be converted to
> some special instructions (I suppose) for the SIMT targets? Currently
> only the TCE target in Clang adds metadata for the required_wg_size
> kernel attribute (as we need it in "offline compilation") but IMHO that
> could be useful in general, as a default metadata (to enable its support
> in pocl for all targets, for example).

Ideally, we would need some standard way of representing this in Clang.
The back-end would then need to convert it to whatever form the target
OpenCL run-time expects. This is a question for cfe-dev.

> --
> Pekka

--
Thanks,

Justin Holewinski
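
The conversion described above, from one standard representation to whatever each target expects, can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary layout, the kernel name `vec_add`, and both `lower_for_*` helpers are hypothetical and are not Clang or LLVM APIs. The attribute is spelled `reqd_work_group_size` in OpenCL C, and `.reqntid` is the PTX directive for a required thread-block size; how a real back-end would emit it is not something this thread settles.

```python
# Hypothetical, target-independent record for one kernel.
# The key mirrors the OpenCL C attribute name reqd_work_group_size.
kernel_info = {"name": "vec_add", "reqd_work_group_size": (64, 1, 1)}


def lower_for_ptx(info):
    """Render the required size as a PTX .reqntid directive (illustrative)."""
    size = info.get("reqd_work_group_size")
    if size is None:
        return ""  # attribute absent: emit nothing
    return ".reqntid " + ", ".join(str(n) for n in size)


def lower_for_host_runtime(info):
    """Hypothetical record for a CPU run-time that wants fixed loop bounds."""
    return {"kernel": info["name"],
            "local_size": info.get("reqd_work_group_size")}


print(lower_for_ptx(kernel_info))
print(lower_for_host_runtime(kernel_info))
```

The point of the sketch is only that the front-end records the attribute once, and each target decides separately how (or whether) to use it.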
Carlos Sánchez de La Lama
2011-Dec-14 15:45 UTC
[LLVMdev] Changes to the PTX calling conventions
Hi all,

>>> I would favor calling conventions over metadata for the simple
>>> reason that this maps more cleanly to the device model. Device and
>>> kernel functions are represented differently in PTX, including
>>> (sometimes) the way parameters are passed.

>> For the record, marking the kernels with "calling conventions"
>> instead of metadata is fine also for the pocl use case. It's enough
>> if there is a way to differentiate OpenCL C kernels from the "device
>> functions" for the reason I discussed in the previous email. That is,
>> in the pocl point of view we just need a way to pick the
>> "host-callable" kernel functions as they need the special treatment
>> before they can be called (like a C function).

Remember that OpenCL kernels are also callable from inside other
kernels. It is not a big deal though, as calling conventions in LLVM IR
are just markers for code generation; they do not have any effect
before that (AFAIK).

What is needed is a way to differentiate, at the LLVM IR level, between:

1) Normal functions.
2) Functions callable from outside and inside (OpenCL kernels would
   fall in this category).
3) Functions callable only from outside (if there is such a case; I am
   not so familiar with CUDA, so I do not know if such functions exist
   in CUDA).

At least 1 and 2 are needed for OpenCL. Whether this is done with
calling conventions, metadata, or attributes does not make such a big
difference in practical terms. Code generation can apply different
calling conventions based on metadata/attributes, and can also detect
the kernels based on calling conventions, so the options are
interchangeable.

>> BTW what about the other OpenCL data like required_wg_size which
>> affect the possible "kernel treatment" of pocl and can be converted
>> to some special instructions (I suppose) for the SIMT targets?
>> Currently only the TCE target in Clang adds metadata for the
>> required_wg_size kernel attribute (as we need it in "offline
>> compilation") but IMHO that could be useful in general, as a default
>> metadata (to enable its support in pocl for all targets, for
>> example).

> Ideally, we would need some standard way of representing this in
> Clang. The back-end would then need to convert it to whatever form
> the target OpenCL run-time expects.

This is an interesting point. And there might be more information
present in .cl files that needs to get transported into LLVM IR. While
the argument has been made that OpenCL "is C", so clang should not need
to generate extra stuff for OpenCL input files, the fact is that it is
not plain C. Basically there are two ways to go:

a) OpenCL is a C-based language (C plus additions) and clang can parse
   it, so *all* the information in the .cl file has to be present in
   LLVM IR.
b) OpenCL is just C, so clang does not need to care about extra things,
   and implementations should parse .cl files to get the extra
   information, and potentially preprocess them to transform the non-C
   constructs into valid C code.

Just staying in between is good for nothing. And given that clang
already has a CL mode (-x cl) that recognizes the keywords and supports
the non-C constructs in OpenCL (like vector swizzle), I think (b) can
be discarded right away. But then all the info should get into LLVM in
a generic way.

> This is a question for cfe-dev.

So adding cfe-dev in copy.

BR

Carlos
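
The claim that calling conventions and metadata are interchangeable markers can be illustrated with a toy model. This is a minimal sketch, not LLVM code: the `Function` class, the marker names (`"kernel"`, `"entry"`), and the `"kind"` metadata key are all hypothetical, chosen only to mirror the three categories listed in the message.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    NORMAL = "normal"          # category 1: normal functions
    KERNEL = "kernel"          # category 2: callable from outside and inside
    ENTRY_ONLY = "entry_only"  # category 3: callable only from outside


@dataclass
class Function:
    name: str
    calling_conv: str = "c"                       # hypothetical marker names
    metadata: dict = field(default_factory=dict)  # hypothetical metadata keys


# Hypothetical mapping from a calling-convention marker to a category.
CONV_TO_KIND = {"c": Kind.NORMAL, "kernel": Kind.KERNEL,
                "entry": Kind.ENTRY_ONLY}


def classify(fn):
    """Recover the category from whichever marker is present."""
    if "kind" in fn.metadata:
        return Kind(fn.metadata["kind"])
    return CONV_TO_KIND[fn.calling_conv]


def host_callable(functions):
    """Names of the functions that need host-callable treatment."""
    return [f.name for f in functions if classify(f) is not Kind.NORMAL]


fns = [
    Function("helper"),
    Function("k1", calling_conv="kernel"),      # marked via calling convention
    Function("k2", metadata={"kind": "kernel"}),  # marked via metadata
]
print(host_callable(fns))
```

Both `k1` and `k2` end up in the same category, which is the sense in which either encoding would serve a consumer such as pocl, as long as one of them is agreed on.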