Ralf Karrenberg
2011-Oct-20 10:45 UTC
[LLVMdev] ANN: libclc (OpenCL C library implementation)
Hi Carlos,

On 10/20/11 9:54 AM, Carlos Sánchez de La Lama wrote:
>> The project started as a use-case for our "Whole-Function Vectorization"
>> library, which transforms a function so that it computes the same as W
>> executions of the original code by using SIMD instructions (W = 4 for
>> SSE/AltiVec, 8 for AVX).
>
> Quite interesting. We were planning to add "vectorization" to our passes
> also, but if I understood the paper correctly your approach uses full
> speculation, which is all right for SIMD architectures but might not be
> so for multi-issue processors.

I don't know what you mean by "speculation" here, but other than that you are right: for best performance, we explicitly target machines with SIMD instruction sets.

>> In contrast to Clover and pocl, we aimed at maximum performance before
>> full support of the API (which simply requires more manpower than one
>> PhD student).
>
> That is wrong, at least for pocl. We do not (by far) support the whole
> API, the main new point on pocl is the LLVM passes to statically create
> the different work items in a workgroup, and the barrier handling. Our
> kernel runtime library is currently in fact fairly small, including just
> a little more than the implementation-dependent functions. We are
> considering merging efforts with libclc in that point.

Please excuse me for getting that wrong.

I think we should really put our heads together (also including Denis Steckelmacher, who implemented Clover) and somehow combine all our efforts. Otherwise, we will probably just all solve the same problems in parallel. Additionally, no user will gain anything if he has to decide between multiple, half-baked solutions.

Best,
Ralf

>> On 19.10.2011 17:38, Hal Finkel wrote:
>>> Do we have a list of these open-source LLVM-based OpenCL projects
>>> somewhere? Off the top of my head, we have:
>>>
>>> libclc: http://www.pcc.me.uk/~peter/libclc/
>>> pocl: https://launchpad.net/pocl
>>> clover: http://cgit.freedesktop.org/~steckdenis/clover/
>>>
>>> (I think that all of these have BSD- or MIT-style licenses).
>>>
>>> Are there any others?
>>>
>>> -Hal
>>>
>>> On Wed, 2011-10-19 at 14:47 +0100, Peter Collingbourne wrote:
>>>> Hi,
>>>>
>>>> This is to announce the availability of libclc, an open source, BSD
>>>> licensed implementation of the library requirements of the OpenCL C
>>>> programming language, as specified by the OpenCL 1.1 Specification.
>>>> libclc is intended to be used with Clang's OpenCL frontend.
>>>>
>>>> libclc website: http://www.pcc.me.uk/~peter/libclc/
>>>>
>>>> libclc is designed to be portable and extensible. To this end,
>>>> it provides generic implementations of most library requirements,
>>>> allowing the target to override the generic implementation at the
>>>> granularity of individual functions.
>>>>
>>>> libclc currently only supports the PTX target, but support for more
>>>> targets is welcome.
>>>>
>>>> How does this project relate to the recently announced Portable OpenCL
>>>> (POCL) project? Unlike POCL, this project is not intended to provide
>>>> an OpenCL host library (i.e. the OpenCL Platform Layer and OpenCL
>>>> Runtime specified in sections 4-5 of the OpenCL specification).
>>>> Instead, it provides only the requirements for the OpenCL C
>>>> Programming Language (section 6 et seq). It is intended to be used
>>>> with an existing host library implementation, and comply with its
>>>> ABI requirements.
>>>>
>>>> An example of such a host library is NVIDIA's OpenCL host library
>>>> for PTX -- the intention is to at some point provide a mechanism
>>>> for using the NVIDIA implementation of OpenCL with Clang, libclc
>>>> and LLVM's PTX backend instead of NVIDIA's own OpenCL compiler.
>>>> Another example would be POCL's host library, and the POCL developers
>>>> have expressed an interest in using libclc as their OpenCL C library
>>>> instead of developing their own.
>>>>
>>>> I hope to find time over the next few weeks to add libclc support
>>>> to the Clang driver. The intention is that compiling OpenCL C programs
>>>> to PTX would be as easy as (something like this):
>>>>
>>>> clang -target ptx32 -S file.cl
>>>>
>>>> such that the driver would automatically locate the libclc headers,
>>>> add them to the include path and pre-include the main header file.
>>>> (The libclc support will of course be optional, and a -cl-stdlib
>>>> flag will be provided to allow for switching between OpenCL standard
>>>> library implementations.)
>>>>
>>>> Thanks,
>>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
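To make the "W executions at once" idea from the message above concrete, here is a minimal hand-written sketch in plain C++. It is not output of the Whole-Function Vectorization library, and the thread does not confirm how that library treats branches. The sketch assumes one common technique: both sides of a branch are computed for every lane and the results are blended with a per-lane mask. The names `scalar_kernel`, `vectorized_kernel` and `check_equivalence` are invented for illustration, and the lane loop stands in for real SIMD instructions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// SIMD width assumed for this sketch (the message mentions W = 4 for SSE/AltiVec).
constexpr std::size_t W = 4;
using Vec = std::array<float, W>;

// Scalar kernel: what a single instance of the original code computes.
float scalar_kernel(float x) {
    if (x < 0.0f)
        return -x * 2.0f;
    return x + 1.0f;
}

// Hand-vectorized counterpart: computes W instances in one call.
// The branch is removed: both results are computed for each lane and
// a per-lane mask selects the one the scalar code would have produced.
Vec vectorized_kernel(const Vec &x) {
    Vec result{};
    for (std::size_t lane = 0; lane < W; ++lane) {
        const bool mask = x[lane] < 0.0f;         // branch condition per lane
        const float then_value = -x[lane] * 2.0f; // "if" side, always computed
        const float else_value = x[lane] + 1.0f;  // "else" side, always computed
        result[lane] = mask ? then_value : else_value; // blend
    }
    return result;
}

// Checks that the vector version matches W scalar executions.
void check_equivalence(const Vec &x) {
    const Vec v = vectorized_kernel(x);
    for (std::size_t lane = 0; lane < W; ++lane)
        assert(v[lane] == scalar_kernel(x[lane]));
}
```

Calling `check_equivalence` with a vector that mixes negative and non-negative inputs exercises both sides of the branch within one call. Computing both sides unconditionally is cheap on SIMD hardware but does extra work per lane, which may be the concern Carlos raises for multi-issue processors.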
Denis Steckelmacher
2011-Oct-20 17:42 UTC
[LLVMdev] Re : ANN: libclc (OpenCL C library implementation)
Hello,

I am the developer of Clover, and so much activity about OpenCL these days is really exciting. Here is my point of view, mainly on Clover and how the projects could use each other.

Clover is made in a way that allows a certain level of modularity. Although POCL would be very difficult to merge into Clover (or Clover into POCL), as these two projects do nearly exactly the same thing (an OpenCL platform layer), Libclc and the German driver are very interesting.

I'll begin with the German driver, maybe the project easiest to integrate or bind to Clover. Clover uses a hardware abstraction layer, a set of virtual classes a driver has to implement in order to be usable by Clover. They are DeviceInterface, DeviceBuffer, DeviceKernel and DeviceProgram. You can see their prototypes here: http://cgit.freedesktop.org/~steckdenis/clover/tree/src/core/deviceinterface.h

Currently, I have developed a small driver, normally not too slow, that simply runs the LLVM IR produced by Clang using the LLVM JIT. The kernels are split into work-groups, which are split into work-items in such a way that multithreading is used efficiently.

What would be interesting is to try to integrate the German driver into Clover using this interface, or to have this driver built as a library that Clover links against (if there is a license problem, but Clover is BSD and it seems that the driver will be the same). I would personally be very excited to see how another driver would perform in Clover, feature-wise and performance-wise.

Libclc could also be useful to Clover, but less likely. The goal of this project is to implement all the OpenCL built-in functions. It's good, but Clover already does the same, using a different technique. Libclc is very elegant (I think), it seems to use custom LLVM intrinsics, and is built around pure C macros. Clover uses a slightly more complex system, involving a Python script "compiling" a set of built-ins into four files. For example, this declaration (REPL is a macro that does a simple for()):

----
def vecf : float2 float3 float4 float8 float16

native $type acospi $vecf : x:$type
    REPL($vecdim)
        result[i] = std::acos(x[i]) / M_PI;
end
----

is compiled to these fragments, one for each vector type (float2, float3, etc):

----
// In stdlib_def.h : what the OpenCL C kernel sees
float2 OVERLOAD acospi(float2 x);

// In stdlib_impl.h : what gets compiled to LLVM IR at Clover compile time,
// and then linked to each kernel
void __cpu_float2_acospi_float2(float *result, float *x);
float2 OVERLOAD acospi(float2 x)
{
    float2 result;

    __cpu_float2_acospi_float2((float *)&result, (float *)&x);

    return result;
}

// __cpu_float2_acospi_float2 is a function implemented in the Clover .so
// library, using llvm::JIT::registerLasyFunctionCreator

// In builtins_impl.h : the actual C++ implementation, included in
// src/core/cpu/builtins.cpp
static void float2_acospi_float2(float *result, float *x)
{
    REPL(2)
        result[i] = std::acos(x[i]) / M_PI;
}

// And then a small else if in the lazy function creator, in order to bind
// everything together
else if (name == "__cpu_float2_acospi_float2")
    return (void *)&float2_acospi_float2;
----

The system works fairly well, and I was able to implement a dozen built-in functions in only two hours. It's very fast to simply declare "native" functions using STL or Boost math functions, and hardware drivers can simply replace the LLVM "call" statements with whatever they need to accelerate the functions on the GPU.

So, libclc would only be useful to Clover if it is developed by so many people that its development becomes much faster than Clover's, and if it provides an easy and efficient way to implement functions natively, without needing an LLVM pass that turns LLVM intrinsics into native function calls.

This is my personal point of view, and I hope a solution will be found so that we do not have three or four different projects working on the same things.

Best regards,
Denis Steckelmacher.
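The binding scheme described in the message above (a native C++ implementation, plus a name lookup that hands its address to the JIT) can be approximated in a standalone form. This is only a sketch under stated assumptions: `REPL` is reimplemented here as the "simple for()" the message describes, `kPi` replaces the non-standard `M_PI`, and `lookup_builtin` is a hypothetical stand-in for Clover's lazy function creator. It returns a typed function pointer instead of the `void *` shown in the message, to stay within standard C++. No LLVM API is used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Stand-in for Clover's REPL macro, described as "a simple for()".
#define REPL(n) for (std::size_t i = 0; i < (n); ++i)

// Local constant used instead of the non-standard M_PI.
const float kPi = 3.14159265358979323846f;

// Signature shared by the native built-ins in this sketch:
// result and argument are passed as pointers to the vector's components.
using BuiltinFn = void (*)(float *result, float *x);

// Native implementation of acospi for a 2-component float vector.
void float2_acospi_float2(float *result, float *x) {
    assert(result != nullptr && x != nullptr);
    REPL(2)
        result[i] = std::acos(x[i]) / kPi;
}

// Hypothetical lookup in the role of the lazy function creator:
// maps the symbol name the kernel calls to the host function,
// or returns nullptr when the name is unknown.
BuiltinFn lookup_builtin(const std::string &name) {
    if (name == "__cpu_float2_acospi_float2")
        return &float2_acospi_float2;
    return nullptr;
}
```

A caller would resolve `"__cpu_float2_acospi_float2"` once and then invoke the returned pointer with two-element input and output arrays. Because the call goes through an address resolved at run time, the compiler cannot inline the math function into the kernel.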
Gregory Junker
2011-Oct-20 17:50 UTC
[LLVMdev] ANN: libclc (OpenCL C library implementation)
> Additionally, no user will gain anything if he has to decide
> between multiple, half-baked solutions.

Truer words were never spoken. ;)

Greg
Justin Holewinski
2011-Oct-20 19:24 UTC
[LLVMdev] Re : ANN: libclc (OpenCL C library implementation)
On Thu, Oct 20, 2011 at 1:42 PM, Denis Steckelmacher <steckdenis at yahoo.fr> wrote:
> Libclc could also be useful to Clover, but less likely. The goal of this
> project is to implement all the OpenCL built-in functions. It's good, but
> Clover already does the same, using a different technique. Libclc is very
> elegant (I think), it seems to use custom LLVM intrinsics, and is built
> around pure C macros.

libclc only uses LLVM intrinsics (currently) for back-end specific functionality. For example, the get_local_id() function is implemented separately for each target, and uses LLVM PTX intrinsics if compiling for the PTX back-end. This is not something you could implement in a generic way without back-end hooks (at least not without dirty hacks in the back-end).

> Clover uses a slightly more complex system, involving a Python script
> "compiling" a set of built-ins into four files. For example, this
> declaration (REPL is a macro that does a simple for()):
>
> ----
> def vecf : float2 float3 float4 float8 float16
>
> native $type acospi $vecf : x:$type
>     REPL($vecdim)
>         result[i] = std::acos(x[i]) / M_PI;
> end
> ----
>
> is compiled to these fragments, one for each vector type (float2, float3,
> etc):
>
> ----
> // In stdlib_def.h : what the OpenCL C kernel sees
> float2 OVERLOAD acospi(float2 x);
>
> // In stdlib_impl.h : what gets compiled to LLVM IR at Clover compile time,
> // and then linked to each kernel
> void __cpu_float2_acospi_float2(float *result, float *x);
> float2 OVERLOAD acospi(float2 x)
> {
>     float2 result;
>
>     __cpu_float2_acospi_float2((float *)&result, (float *)&x);
>
>     return result;
> }
>
> // __cpu_float2_acospi_float2 is a function implemented in the Clover .so
> // library, using llvm::JIT::registerLasyFunctionCreator
>
> // In builtins_impl.h : the actual C++ implementation, included in
> // src/core/cpu/builtins.cpp
> static void float2_acospi_float2(float *result, float *x)
> {
>     REPL(2)
>         result[i] = std::acos(x[i]) / M_PI;
> }
>
> // And then a small else if in the lazy function creator, in order to bind
> // everything together
> else if (name == "__cpu_float2_acospi_float2")
>     return (void *)&float2_acospi_float2;
> ----

If the LLVM JIT picks up these functions at run-time, then there is no chance of inlining these math functions. This is not good for performance.

> The system works fairly well, and I was able to implement a dozen
> built-in functions in only two hours. It's very fast to simply declare
> "native" functions using STL or Boost math functions, and hardware drivers
> can simply replace the LLVM "call" statements with whatever they need to
> accelerate the functions on the GPU.

But then the hardware driver layer has to have GPU implementations for all of these functions.

--
Thanks,
Justin Holewinski
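The point about get_local_id() needing a separate implementation per target can be illustrated with a CPU-only sketch. Everything here is an assumption made for illustration and is not code from libclc, Clover or pocl: the work-group is run as a sequential loop, the current work-item index is kept in a thread-local variable named `current_local_id`, and only dimension 0 is modelled. A GPU back-end such as PTX would read the index from hardware through a target intrinsic instead, which is why the function cannot be written once generically.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-target state: index of the work-item currently
// being executed on this thread (dimension 0 only).
thread_local std::size_t current_local_id = 0;

// CPU-target version of get_local_id for this sketch.
// Dimensions other than 0 are not modelled and yield 0.
std::size_t get_local_id(unsigned dim) {
    return dim == 0 ? current_local_id : 0;
}

// Example kernel body: each work-item writes to its own slot.
void example_kernel(std::vector<int> &out) {
    const std::size_t id = get_local_id(0);
    assert(id < out.size());
    out[id] = static_cast<int>(id) * 10;
}

// Runs one work-group by executing its work-items one after another,
// updating the thread-local index before each kernel invocation.
void run_work_group(std::vector<int> &out, std::size_t local_size) {
    for (std::size_t id = 0; id < local_size; ++id) {
        current_local_id = id;
        example_kernel(out);
    }
}
```

Running `run_work_group` on a four-element vector fills it with 0, 10, 20 and 30. Swapping in a different target would mean replacing only `get_local_id` and the loop that drives the work-items, while the kernel body stays unchanged.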
Marc J. Driftmeyer
2011-Oct-21 03:28 UTC
[LLVMdev] ANN: libclc (OpenCL C library implementation)
I was hoping you would come to the collaborative, joint solution. I've been waiting for Clang to have a settled OpenCL implementation to start working on OpenCL.

Dealing with 3 parallel projects would be just that, a pain in the rear.

- Marc

On 10/20/2011 03:45 AM, Ralf Karrenberg wrote:
> I think we should really stick our heads together (also including Denis
> Steckelmacher who implemented Clover) and somehow combine all our efforts.
> Otherwise, we will probably just all solve the same problems in
> parallel. Additionally, no user will gain anything if he has to decide
> between multiple, half-baked solutions.
>
> Best,
> Ralf
Pekka Jääskeläinen
2011-Oct-21 07:51 UTC
[LLVMdev] ANN: libclc (OpenCL C library implementation)
On 10/21/2011 06:28 AM, Marc J. Driftmeyer wrote:
> I was hoping you would come to the collaborative, joint solution. I've been
> waiting for Clang to have a settled OpenCL implementation to start working on
> OpenCL.

We at pocl are looking forward to the release of the German code, as it seems to be closest to our interest in providing a performance-portable OpenCL implementation. Hopefully there will be some co-operation on that.

Meanwhile we keep working on our code base to make it fulfill our research goals in the TCE project, e.g. adding multithreading at the work-group (WG) level, more flexible work-item parallelization options, more API functions to run more benchmarks, etc.

> Dealing with 3 parallel projects would be just that, a pain in the rear.

Just pick your favorite and contribute. I'm sure there will be merging of efforts at some point, and your work won't go to waste.

--
--Pekka