Jakub (Kuba) Kuderski via llvm-dev
2021-Jan-22 17:00 UTC
[llvm-dev] LLVM GPU News Issue #4, January 22 2021
Hi folks,

The fourth issue of LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella, is now available at: https://llvm-gpu-news.github.io/2021/01/22/issue-4.html

This release adds a new section dedicated to OpenMP Target Offloading, maintained by Johannes Doerfert, while Lei Zhang took on reporting MLIR/SPIR-V news.

I'm also pasting the content below, in case you prefer to read in your email client.

-Jakub

=====================================================================

# LLVM GPU News Issue #4, January 22 2021

Welcome to the fourth issue of LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella. This issue covers the period from January 8 to January 21 2021.

LLVM GPU News gained a new section: OpenMP Target Offloading, maintained by Johannes Doerfert.

We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.

## Industry News and Conference Talks

* Dmitrii Tolmachev published a blog post on [real-time image registration on GPU with VkFFT](https://towardsdatascience.com/real-time-image-registration-on-gpu-with-vkfft-library-c4e47f8050a0) -- a self-made [Vulkan Fast Fourier Transform library](https://github.com/DTolm/VkFFT). Image registration is the problem of determining what coordinate system transformation to apply to an image in order to match it against a different image of the same object. Using a highly-optimized FFT implementation on a commodity GPU (Nvidia 1660Ti) allowed Dmitrii to run the image registration algorithm in real time. Matching a pair of 1024x1024 screenshots from Cyberpunk 2077 took around 3ms. The readme on Dmitrii's GitHub mentions that they are looking for a PhD position or a job.
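
The image registration problem described above can be illustrated with phase correlation, one common FFT-based way to recover a translation between two signals. The sketch below is an assumption for illustration only, not the algorithm from the blog post and not the VkFFT API: it works on a 1D signal, uses a naive pure-Python DFT in place of a GPU FFT, and the names `dft` and `estimate_shift` are hypothetical.

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for a real FFT library)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, x in enumerate(signal):
            acc += x * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The inverse transform carries the 1/n normalization.
        out.append(acc / n if inverse else acc)
    return out


def estimate_shift(reference, moved):
    """Estimate the circular shift s with moved[i] == reference[(i - s) % n]."""
    f_ref = dft(reference)
    f_mov = dft(moved)
    cross = []
    for a, b in zip(f_ref, f_mov):
        # Cross-power spectrum: keep only the phase difference per frequency.
        c = b * a.conjugate()
        mag = abs(c)
        cross.append(c / mag if mag > 1e-12 else 0j)
    # The inverse transform of a pure phase ramp peaks at the shift.
    corr = dft(cross, inverse=True)
    return max(range(len(corr)), key=lambda i: corr[i].real)


# Hypothetical sample data: shift a short signal by 3 positions and recover it.
reference = [0, 1, 4, 2, 7, 3, 0, 6]
moved = [reference[(i - 3) % len(reference)] for i in range(len(reference))]
print(estimate_shift(reference, moved))  # 3
```

For 2D images the same steps apply with a 2D transform, and the position of the correlation peak gives the shift along both axes.
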
## LLVM and Clang

### Discussions

* The AMDGPU backend is no longer the blocker for switching to the New Pass Manager. The last failing test was [pinned to use the Legacy Pass Manager](https://reviews.llvm.org/D95051), while the work on making Divergence Analysis work with the New Pass Manager [is still in progress](https://lists.llvm.org/pipermail/llvm-dev/2021-January/147946.html).
* Burlen Loring asked about [Clang/LLVM and CUDA version compatibility on Fedora](https://lists.llvm.org/pipermail/cfe-dev/2021-January/067532.html). There are no replies as of this writing.

### Commits

* (In-review) Add [AMDGPU lower function LDS pass](https://reviews.llvm.org/D94648). The strategy is to create a new struct with a field for each LDS variable and allocate that struct at the same address for every kernel. This allows some OpenMP kernels for AMDGPU to work with the deviceRTL runtime library that uses CUDA shared variables from functions that cannot be inlined.
* `AMDGPUSubtargets.h` was split into two headers: [`R600Subtarget.h` and `GCNSubtarget.h`](https://reviews.llvm.org/D95036). This reduces include dependencies and improves LLVM build times.
* (In-review) [Implement HIP codegen support for the `__managed__` attribute.](https://reviews.llvm.org/D94814) This attribute can be applied to global variables. Managed variables can be used by both device and host code. The ROCm programming guide [mentions managed variables as not supported](https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html#variable-type-qualifiers) and does not describe their semantics yet.

## MLIR

### Discussions

* Lenny Guo [expressed intention](https://llvm.discourse.group/t/generate-spirv-binary-from-mlir-dialect-kernels-to-run-it-on-ocl-runtime/2501/4) to work on OpenCL conversions via SPIR-V and bring up an mlir-opencl-runner.

### Commits

* The SPIR-V dialect now knows traits like [`SignedOp`](https://reviews.llvm.org/D94896), [`UnsignedOp`](https://reviews.llvm.org/D94068), and [`UsableInSpecConstantOp`](https://reviews.llvm.org/D94288) to process ops in these categories uniformly.
* `spv.SpecConstantOperation` is fully supported now, including serialization and deserialization.

## OpenMP (Target Offloading)

### Discussions

* Discussions usually happen on the mailing list (openmp-dev at lists.llvm.org) or in our weekly ["OpenMP in LLVM" meeting](https://docs.google.com/document/d/1Tz8WFN13n7yJ-SCE0Qjqf9LmjGUw0dWO9Ts1ss4YOdg/edit?usp=sharing); feel free to join in!
* The LLVM/OpenMP documentation has been [online](https://openmp.llvm.org/docs/index.html) for a few weeks. Initial content is there but the [FAQ](https://openmp.llvm.org/docs/SupportAndFAQ.html) and other sections are still very much empty; we are looking for volunteers and topics.
* The memory management APIs proposed for OpenMP 6.0, which among other things allow allocating managed memory, are being discussed for (potentially opt-in) inclusion into the LLVM runtime very soon.

### Commits

* The `declare mapper` API was the last part of data mapping that did not transfer source information to the runtime (location and name of the mapped objects). This [has now been changed](https://reviews.llvm.org/D94806), which will cause [`LIBOMPTARGET_INFO`](https://openmp.llvm.org/docs/design/Runtimes.html#libomptarget-info) to display plenty of useful information about mapped objects.
* The [second patch](https://reviews.llvm.org/D94855) for the OpenMP target profiling allows us to trace multiple threads that are offloading concurrently. See the documentation for [`LIBOMPTARGET_PROFILE`](https://openmp.llvm.org/docs/design/Runtimes.html#libomptarget-profile).
* Support for a PTX device runtime [has been dropped](https://reviews.llvm.org/D94725) in favor of the superior way, using an LLVM-IR device runtime.
  The latter is now easy to build: simply move `openmp` from the enabled projects to the enabled runtimes (see [how to build an OpenMP offloading capable compiler](https://openmp.llvm.org/docs/SupportAndFAQ.html#q-how-to-build-an-openmp-offload-capable-compiler)).
* The `nowait` support for `omp target` directives via ["hidden helper threads"](https://tianshilei.me/wp-content/uploads/concurrent-lcpc2020.pdf) was [upstreamed](https://reviews.llvm.org/D77609). Given some problems encountered afterwards, it might need to be refined slightly and might not make it into LLVM 12 after all.
* (In-review) [Driver support for OpenMP offloading onto AMD GPUs.](https://reviews.llvm.org/D94961)
* (In-review) A series of patches is underway to allow building the device runtime in pure OpenMP + C++. An overview of the effort can be found [here](https://reviews.llvm.org/D94745).
* (In-review) A patch to [build the CUDA plugin without having CUDA installed](https://reviews.llvm.org/D95155) on the build machine. Together with a [CUDA-free device runtime](https://reviews.llvm.org/D94745) and a pre-built selection of device runtimes for various architectures, this will allow us to enable OpenMP offloading in LLVM releases, e.g., via Linux distributions.

## External Compilers

### LLPC

Lu Jiao added [support for compiling SPIR-V shaders with the linkage capability](https://github.com/GPUOpen-Drivers/llpc/pull/1110). Such library SPIR-V shaders do not have an entry function, so a dummy entry function has to be created for them.

### Mesa

### SYCL

-- Jakub Kuderski