thr3ads.net - search: "pacxx"

Displaying 9 results from an estimated 9 matches for "pacxx".

[RFC] Upstreaming PACXX (Programing Accelerators with C++)

2018 Feb 05

...f development, various talks on LLVM-HPC and EuroLLVM and other scientific conferences I want to present my PhD research topic to the lists. The main goal for my research was to develop a single-source programming model equal to CUDA or SYCL for accelerators supported by LLVM (e.g., Nvidia GPUs). PACXX uses Clang as front-end for code generation and comes with a runtime library (PACXX-RT) to execute kernels on the available hardware. Currently, PACXX supports Nvidia GPUs through the NVPTX Target and CUDA, CPUs through MCJIT (including whole function vectorization thanks to RV [1]) and has an expe...

[RFC] Upstreaming PACXX (Programing Accelerators with C++)

2018 Feb 05

...ed by compile time reflection so that calling kernels is done by symbol. What kind of performance difference do you see running code that was not developed with GPU in mind (e.g. range-v3) vs code that was? What restrictions do you apply? I assume virtual functions, recursion. What else? How does pacxx's SPMD model differ from what one can do in LLVM at the moment? Nic [1]: http://github.com/libmir/dcompute/ > On 5 Feb 2018, at 7:11 am, Haidl, Michael via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > HI LLVM comunity, > > after 3 years of development, various tal...

[RFC] Upstreaming PACXX (Programing Accelerators with C++)

2018 Feb 05

...ection so that calling kernels is done by symbol. > > What kind of performance difference do you see running code that was not developed with GPU in mind (e.g. range-v3) vs code that was? > What restrictions do you apply? I assume virtual functions, recursion. What else? > > How does pacxx's SPMD model differ from what one can do in LLVM at the moment? > > Nic > > [1]: http://github.com/libmir/dcompute/ > >> On 5 Feb 2018, at 7:11 am, Haidl, Michael via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> HI LLVM comunity, >> >> af...

LLVM Weekly - #215, Feb 12th 2018

2018 Feb 12

...ware of superword-level parallelism (SLP) operations. * David Li has written a short [summary](http://lists.llvm.org/pipermail/llvm-dev/2018-February/120964.html) of developments in profile-guided optimisation with LLVM over the past 18 months. * Michael Haidl started a discussion on [upstreaming PACXX](http://lists.llvm.org/pipermail/llvm-dev/2018-February/120965.html). "The main goal for my research was to develop a single-source programming model equal to CUDA or SYCL for accelerators supported by LLVM (e.g., Nvidia GPUs). PACXX uses Clang as front-end for code generation and comes with a...

[RFC] A New Divergence Analysis for LLVM

2018 May 28

...Region Vectorizer (github [1]) is an analysis and transformation framework for outer-loop and whole-function vectorization. RV vectorizes arbitrary, reducible control flow including nested divergent loops through partial control-flow linearization [2]. RV is being used by the Impala [7] and the PACXX [3,8] high performance programming frameworks and implements OpenMP #pragma omp simd and #pragma omp declare simd. -- The Divergence Analysis -- The Divergence Analysis determines how instructions will behave if executed in lockstep for multiple threads or vector lanes. The loop vectorizer (V...

[AMDGPU] Strange results with different address spaces

2017 Dec 05

Hi dev list, I am currently exploring the integration of AMDGPU/ROCm into the PACXX project and observing some strange behavior of the AMDGPU backend. The following IR is generated for a simple address space test that copies from global to shared memory and back to global after a barrier synchronization. Here is the IR is attached as as1.ll The output is as follows: 0 0 0 0 0...

[AMDGPU] Strange results with different address spaces

2017 Dec 05

...il.com> wrote: > > > >> On Dec 5, 2017, at 02:51, Haidl, Michael via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Hi dev list, >> >> I am currently exploring the integration of AMDGPU/ROCm into the PACXX project and observing some strange behavior of the AMDGPU backend. The following IR is generated for a simple address space test that copies from global to shared memory and back to global after a barrier synchronization. >> >> Here is the IR is attached as as1.ll >> >> T...

[LLVMdev] Emitting IR in older formats (for NVVM)

2015 Jan 13

Since SPIR can be (easily) transformed to NVVM IR at least for me this helps a lot. Thank you Tobias. -MH On January 12, 2015, Tobias Grosser <tgrosser at inf.ethz.ch> wrote: > On 12.01.2015 05:48, Jonathan Ragan-Kelley wrote: > > This question is specifically motivated by the practical constraints of > > NVVM, but I don't know anywhere better to ask (hopefully, e.g.,

[LLVMdev] Emitting IR in older formats (for NVVM)

2015 Jan 12

This question is specifically motivated by the practical constraints of NVVM, but I don't know anywhere better to ask (hopefully, e.g., @jholewinski is still following), and I believe it concerns general LLVM issues: NVIDIA's libNVVM is built on LLVM 3.2. This means its bitcode and LL text parsers are from that generation. Its interface calls for adding modules as either bitcode
