similar to: OpenCL toolset (for AMD GPU)

Displaying 20 results from an estimated 900 matches similar to: "OpenCL toolset (for AMD GPU)"

2015 Sep 29
2
OpenCL toolset (for AMD GPU)
On 09/29/2015 04:19 PM, Tom Stellard via llvm-dev wrote: > On Tue, Sep 29, 2015 at 01:20:57PM +0000, Paweł Bylica via llvm-dev wrote: >> Hi LLVM, >> >> I would like to compile OpenCL kernel for a specific AMD GPU target. Is it >> possible with the current clang/LLVM? >> >> I started by using `clang -x cl` but it looks like at least some OpenCL >>
2015 Feb 03
2
[LLVMdev] Example for usage of LLVM/Clang/libclc
Hi, My goal is to use Clang/LLVM/libclc to compile an OpenCL kernel and eventually generate a PTX code. I already did this but I am not sure if the PTX code I am generating is correct (is the one that is supposed to be generated). For example, currently, In OpenCL : get_global_id(0) translates to In LLVM : %call = tail call i32 @get_global_id(i32 0) which translates to In PTX:
2016 Mar 05
2
[AMDGPU] non-hsa intrinsic with hsa target
Dear Developers, I compiled an OpenCL kernel before (in November of last year) like __kernel void g(__global float* array) { array[get_global_id(0)] = 1; } with libclc, which would originally use intrinsics like llvm.r600.read.local.size.x(). I executed the generated object file with one version of the hsa-runtime [1] provided by Mr. Stellard. When there was more than one workgroup, the output
2016 Jun 02
3
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello Bergström/Eric, Thanks for the reply. The G80 (sm_10) architecture was ported to an FPGA by a group of researchers (http://www.ecs.umass.edu/ece/tessier/andryc-fpt13.pdf). Our group has some further research interest in this work. I have been modifying Clang-LLVM for a couple of months and have made the required changes. But Clang-LLVM is only allowing me to generate PTX for sm_20,
2016 Mar 05
2
[AMDGPU] non-hsa intrinsic with hsa target
Hi Mr. Liu, Thanks for your quick reply. I compiled the code with the libclc_trunk and linked the bitcode file under $LIBCLC_DIR/built_libs/tahiti-amdgcn--.bc. After looking into the libclc, it is currently using the new workitem intrinsics (commit ba9858caa1e927a6fcc601e3466faa693835db5e). In the linked bitcode ($LIBCLC_DIR/built_libs/tahiti-amdgcn--.bc), it has the following code segment,
2016 Jun 02
5
PTX generation from CUDA file for compute capability 1.0 (sm_10)
Hello, When generating the PTX output from CUDA file(.cu file), the minimum target that is accepted by LLVM is sm_20. But I have a specific requirement to generate PTX output for compute capability 1.0 (sm_10). Is there any previous version of LLVM supporting this? Thank you, Ginu
2012 Feb 27
2
[LLVMdev] Alias in LLVM 3.0
We use alias extensively in our library to support OpenCL generating code for both our CPUs and GPUs. During the transition to LLVM 3.0 with the new type system, we're seeing two problems. Both involve type conversions occurring across an alias. In one case, one of the types is pointer to an opaque type, and ends up creating an assert in the verifier where it is checking that argument types
2017 May 08
2
[OpenCL][AMDGPU] Using AMDGPU generated kernel code for OpenCL
Hello everyone I was wondering, what the correct way of using an AMDGPU generated kernel code for OpenCL was. I am trying to provide Polly's GPGPU Code generation with the ability to run on different GPU devices, such as AMD GPUs. For NVIDIA, I simply retrieve a pre-compiled PTX string from the NVPTX backend and pass that to OpenCL's 'clCreateProgramWithBinary' function. However,
2019 Jun 25
2
x86 instructions EFLAGS in TableGen
Hello, Here is one question regarding the LLVM TableGen: Which file in the llvm/lib/Target/X86 folder describes how the bits in the EFLAGS register are modified by the x86 instructions? For example, in the "X86InstrInfo.td" file, lines 2134-2135, it says: let SchedRW = [WriteALU], Defs = [EFLAGS], Uses = [EFLAGS] in { def CLC : I<0xF8, RawFrm, (outs), (ins), "clc",
2019 Sep 09
2
LiveInterval error with 2 dead defs
Hi, I’m hitting a machine verifier error in a trivial testcase which I don’t understand. There are 2 dead defs of the same register: --- name: multiple_connected_compnents_dead tracksRegLiveness: true body: | bb.0: dead %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec dead %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec ... The live intervals look OK to me with 1 valno
2009 Aug 20
3
[PATCH ovirt-node-image] fixes for edit-livecd
Patch set fixes issues with image size increase when using edit-livecd Also address issue with ext4 root fs
2017 Sep 06
4
post_processor in rmarkdown not working
Dear all, I'm trying to write a post_processor() for a custom rmarkdown format. The goal of the post_processor() is to modify the latex file before it is compiled. For some reason the post_processor() is not run. The post_processor() does work when I run it manually on the tex file. Any suggestions on what I'm doing wrong? Below is the relevant snippet of the code. The full code is
2006 Jul 05
2
Protecting Static content
Hi, I want to build a Rails-backed site which, in addition to some dynamic content, also comprises a number of static content files. There are some static HTML pages, some PowerPoint presentations, and some PDF documents. I want to make sure that the user is logged in before they can access the protected content. I've gone through the 'Agile development with
2017 Sep 07
3
post_processor in rmarkdown not working
Dear Duncan, Thanks for chiming in. Could you explain how you set debug() on post_processor()? I've tried adding debug(post_processor) to rsos_article() or adding debug(post_processor) after post_processor was defined in the debugger. Neither works for me. All supporting files are available within the package. The code below should be reproducible on your machine.
2014 Dec 15
2
(no subject)
Hi, I noticed the diff wasn't showing the "-" at the start of a deleted file and the details of the second file when attributes are different. This change should fix them. thanks.
2011 Oct 19
5
[LLVMdev] ANN: libclc (OpenCL C library implementation)
Hi, This is to announce the availability of libclc, an open source, BSD licensed implementation of the library requirements of the OpenCL C programming language, as specified by the OpenCL 1.1 Specification. libclc is intended to be used with Clang's OpenCL frontend. libclc website: http://www.pcc.me.uk/~peter/libclc/ libclc is designed to be portable and extensible. To this end, it
2011 Oct 19
6
[LLVMdev] ANN: libclc (OpenCL C library implementation)
Hi everybody, the compiler design lab at Saarland University (chair of Sebastian Hack) is also working on an LLVM-based OpenCL driver. The project started as a use case for our "Whole-Function Vectorization" library, which transforms a function so that it computes the same result as W executions of the original code by using SIMD instructions (W = 4 for SSE/AltiVec, 8 for AVX). The
2017 Sep 07
0
post_processor in rmarkdown not working
Are you sure that you want to read in the output_file in text <- readLines(output_file, warn = FALSE)? best regards, Heinz Thierry Onkelinx wrote/hat geschrieben on/am 06.09.2017 11:41: > Dear all, > > I'm trying to write a post_processor() for a custom rmarkdown format. The > goal of the post_processor() is to modify the latex file before it is > compiled. For some
2018 Sep 05
4
Can I control HSA config generated by AMDGPU backend?
Finally, I more or less modified LLVM to generate assembly that can run on the AMDGPU-PRO drivers. One problem is that the code generated by LLVM is about 10% slower than the code from amdgpu's online compiler. Is there anything I can tune to improve the performance of the LLVM-generated code? Thanks! On Tue, Sep 4, 2018 at 9:23 AM 董昌道 <dongchangdao at gmail.com> wrote: > I am writing a miner of crypto
2017 Sep 07
2
post_processor in rmarkdown not working
On 07/09/2017 2:04 PM, Duncan Murdoch wrote: > On 07/09/2017 10:11 AM, Thierry Onkelinx wrote: >> Dear Duncan, >> >> Thanks for chiming in. Could you explain how you set debug() on >> post_processor()? I've tried adding debug(post_processor) to >> rsos_article() or adding debug(post_processor) after post_processor >> was defined in the debugger.