thr3ads.net - similar to: "[LLVMdev] Example for usage of LLVM/Clang/libclc"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Example for usage of LLVM/Clang/libclc"

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

Ok, as I said, the most precise way to figure out what's wrong is to emit LLVM IR first (use clang -emit-llvm ...) and check out how it differs from working examples, for instance, nvptx regression tests.
----- Original message -----
> I'm building this with llvm-c, and accessing these intrinsics via calling
> the intrinsic as if it were a function.
>
> class F_SREG<string
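The comparison suggested in this reply can be sketched with Python's standard difflib module. The two IR strings are hypothetical stand-ins (assumptions, not taken from the thread) for what clang -emit-llvm might give for a working kernel and a failing one; in this made-up pair only the callee name differs.

```python
import difflib

def diff_ir(working: str, suspect: str) -> list[str]:
    """Return unified-diff lines between two LLVM IR texts."""
    return list(difflib.unified_diff(
        working.splitlines(), suspect.splitlines(),
        fromfile="working.ll", tofile="suspect.ll", lineterm=""))

# Hypothetical IR fragments, for illustration only.
WORKING = """define void @k(i32* %out) {
  %x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
  ret void
}"""
SUSPECT = """define void @k(i32* %out) {
  %x = call i32 @INT_PTX_SREG_TID_X()
  ret void
}"""

# Keep only added/removed lines, skipping the "---"/"+++" file headers.
changed = [line for line in diff_ir(WORKING, SUSPECT)
           if line.startswith(("+", "-"))
           and not line.startswith(("+++", "---"))]

for line in changed:
    print(line)
```

On real output the diff will be noisier (metadata, attributes, value numbering), so filtering for call instructions first may help.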

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

Hi Timothy, I'm not sure what you mean by this working for other intrinsics, but in this case, I think you want the intrinsic name llvm.nvvm.read.ptx.sreg.tid.x. For me, this looks like:
    %x = call i32 @llvm.nvvm.read.ptx.sreg.tid.x()
Pete
On Fri, Mar 1, 2013 at 11:51 AM, Timothy Baldridge <tbaldridge at gmail.com> wrote:
> I'm building this with llvm-c, and accessing these

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

The identifier INT_PTX_SREG_TID_X is the name of an instruction as the back-end sees it, and has very little to do with the name you should use in your IR. Your best bet is to look at the include/llvm/IR/IntrinsicsNVVM.td file and see the definitions for each intrinsic. Then, the name mapping is just:
    int_foo_bar -> llvm.foo.bar()
int_ prefix becomes llvm., and all underscores turn into
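The int_foo_bar -> llvm.foo.bar mapping shown in this reply can be written as a small Python helper. The function name intrinsic_ir_name is hypothetical, and the sketch assumes the default naming rule applies; a record that sets its IR name explicitly in the .td file would not follow it.

```python
def intrinsic_ir_name(td_name: str) -> str:
    """Map a TableGen intrinsic record name to its default LLVM IR name."""
    prefix = "int_"
    if not td_name.startswith(prefix):
        raise ValueError(f"not an intrinsic record name: {td_name!r}")
    # Drop "int_", prepend "llvm.", and replace the remaining underscores
    # the way the int_foo_bar -> llvm.foo.bar example does.
    return "llvm." + td_name[len(prefix):].replace("_", ".")

print(intrinsic_ir_name("int_foo_bar"))
print(intrinsic_ir_name("int_nvvm_read_ptx_sreg_tid_x"))
```

Passing a back-end instruction name such as INT_PTX_SREG_TID_X raises ValueError, because the check is case-sensitive and that identifier is not an intrinsic record.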

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

I'm building this with llvm-c, and accessing these intrinsics via calling the intrinsic as if it were a function.
class F_SREG<string OpStr, NVPTXRegClass regclassOut, Intrinsic IntOp> :
  NVPTXInst<(outs regclassOut:$dst), (ins), OpStr,
            [(set regclassOut:$dst, (IntOp))]>;
def INT_PTX_SREG_TID_X : F_SREG<"mov.u32 \t$dst, %tid.x;",

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

I've written a compiler that outputs PTX code. The result seems fairly reasonable, but I'm not sure the intrinsics are getting compiled correctly. In addition, when I try to load the module using CUDA, I get an error: CUDA_ERROR_NO_BINARY_FOR_GPU. I'm running this on a 2012 MBP with a 640M GPU. PTX code (for a Mandelbrot calculation):
//
// Generated by LLVM NVPTX Back-End
//

[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU

2013 Mar 01

Timothy, Those calls to compute grid intrinsics are definitely wrong. In PTX code they should end up as reads of special registers, rather than function calls. Try to take some working example and figure out the LLVM IR differences between it and the result of your compiler.
- D.
----- Original message -----
> I've written a compiler that outputs PTX code, the result seems fairly
>
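A rough way to check this point on generated PTX is to look for call instructions where a special-register read was expected. The Python sketch below is a plain text scan, not a PTX parser. Both PTX fragments are hypothetical, and treating any line that starts with "call" as a call instruction is a simplifying assumption.

```python
def call_lines(ptx: str) -> list[str]:
    """Return PTX lines that look like call instructions."""
    return [line.strip() for line in ptx.splitlines()
            if line.strip().startswith("call")]

def reads_register(ptx: str, reg: str = "%tid.x") -> bool:
    """True if the special register name appears on a non-comment line."""
    return any(reg in line for line in ptx.splitlines()
               if not line.strip().startswith("//"))

# Hypothetical fragments: the first reads the special register directly,
# the second (wrongly) calls a function instead.
GOOD = "mov.u32 \t%r1, %tid.x;"
BAD = "call.uni (retval0), llvm_nvvm_read_ptx_sreg_tid_x, ();"

for name, ptx in (("good", GOOD), ("bad", BAD)):
    print(name, "calls:", call_lines(ptx), "reads %tid.x:", reads_register(ptx))
```

A kernel that legitimately calls device functions will also show up in call_lines, so the result is a hint about where to look rather than a verdict.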

LIBCLC with LLVM 3.9 Trunk

2016 Apr 08

It's not clear from your original message what is actually wrong. I think you need to give some more information about what you are doing: example source, the target GPU, compiler error messages or other evidence of "it's wrong" (LLVM IR, disassembly, etc.) ...
--
Mats
On 8 April 2016 at 09:55, Liu Xin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I built it

[LLVMdev] Example for usage of LLVM/Clang/libclc

2015 Feb 05

Hi,
> which works but it produces LLVM IR code for all OpenCL intrinsics
> implemented by libclc along with the kernel I am interested in, is there a
> possibility to avoid this ? and only produce the llvm code for the kernel
> required ?
Mark all functions apart from the kernel entry points with the internal attribute and then run global dead code elimination (it should remove most
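The effect of the suggestion in this reply can be modelled on a toy call graph: once only the kernel entry points remain externally visible, anything not reachable from them is dead and can be dropped. The Python sketch below illustrates that reachability idea only; it is not LLVM's pass, and all function names in the example graph are hypothetical.

```python
def internalize_and_dce(call_graph: dict[str, set[str]],
                        kernels: set[str]) -> set[str]:
    """Return the functions that survive when only `kernels` stay visible.

    call_graph maps each function name to the names of the functions it calls.
    """
    live: set[str] = set()
    worklist = [k for k in kernels if k in call_graph]
    while worklist:
        fn = worklist.pop()
        if fn in live:
            continue
        live.add(fn)
        # Everything a live function calls is live as well.
        worklist.extend(call_graph.get(fn, ()))
    return live

# Hypothetical module: one kernel plus library code linked in alongside it.
graph = {
    "my_kernel": {"get_global_id"},
    "get_global_id": {"read_tid"},
    "read_tid": set(),
    "sin": {"sin_helper"},
    "sin_helper": set(),
}
survivors = internalize_and_dce(graph, {"my_kernel"})
print(sorted(survivors))
```

Here sin and sin_helper are removed because nothing reachable from my_kernel calls them.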

[AMDGPU] non-hsa intrinsic with hsa target

2016 Mar 05

Hi Mr. Liu, Thanks for your quick reply. I compiled the code with the libclc_trunk and linked the bitcode file under $LIBCLC_DIR/built_libs/tahiti-amdgcn--.bc. Looking into libclc, I see that it currently uses the new workitem intrinsics (commit ba9858caa1e927a6fcc601e3466faa693835db5e). The linked bitcode ($LIBCLC_DIR/built_libs/tahiti-amdgcn--.bc) has the following code segment,

[AMDGPU] non-hsa intrinsic with hsa target

2016 Mar 05

Dear Developers, I compiled an OpenCL kernel a while ago (last November), like
    __kernel void g(__global float* array) { array[get_global_id(0)] = 1; }
with libclc, which would originally use intrinsics like llvm.r600.read.local.size.x(). I executed the generated object file with one version of the hsa-runtime [1] provided by Mr. Stellard. When there was more than one workgroup, the output
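For context on why the local size matters once there is more than one workgroup: in OpenCL, a work-item's global ID in a dimension is the global offset plus the group ID times the local size plus the local ID. The Python sketch below models only that arithmetic. The function name is hypothetical, this is not libclc's implementation, and the "wrong local size" case is an illustration, not a diagnosis of the problem reported in the thread.

```python
def global_id(group_id: int, local_size: int, local_id: int,
              global_offset: int = 0) -> int:
    """Global work-item ID in one dimension, per the OpenCL execution model."""
    return global_offset + group_id * local_size + local_id

# Two workgroups of four work-items each, with the correct local size.
correct = [global_id(g, 4, l) for g in range(2) for l in range(4)]

# Same launch, but pretending the local size was read back as 0.
collapsed = [global_id(g, 0, l) for g in range(2) for l in range(4)]

print(correct)
print(collapsed)
```

With a single workgroup both lists would agree, since the group ID is 0 and the local size drops out of the formula.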

Bayesian question (problem using adapt)

2006 Nov 11

In the following code I have created the posterior density for a Bayesian survival model with four parameters. However, when I try to use the adapt function to perform integration in four dimensions (on my old version of R I get an error message saying that I have applied a non-function, although the function does work when I type kernel2(param0, theta0), or on the newer version of R the computer

[LLVMdev] Build PTX samples with LLVM/Clang/libclc

2011 Dec 07

Hi Justin, I downloaded llvm-ptx-samples [1] and tried to build them. There seems to be no complete document on how to build them with LLVM/Clang/libclc. Do you think it's a good idea to put a complete document/tutorial in _one_ place? Currently, there are your website [2], LLVM [3], Clang and libclc websites [5]. I feel people might get lost among those websites. ;-) Here

builtins name mangling in SPIR 2.0

2016 Sep 12

Hi all, According to the SPIR 2.0 spec[1], the names of OpenCL builtins are mangled. However, when I compile OpenCL code with Clang 3.9 with the "spir64-unknown-unknown" target, Clang generates IR without mangling the builtins, e.g. for:
    __kernel void input_zip_int(__global int *in0) { *in0 = get_global_id(0); }
clang generates:
    define spir_kernel void @input_zip_int(i32

builtins name mangling in SPIR 2.0

2016 Sep 12

Thanks a lot.
On Mon, Sep 12, 2016 at 1:42 PM, Liu, Yaxun (Sam) <Yaxun.Liu at amd.com> wrote:
> If you use the default header file under clang/lib/Headers/opencl-c.h,
> get_global_id will be mangled.
>
> If you want to declare get_global_id in your own header, add
> __attribute__((overloadable)), then it will be mangled.
>
> Sam
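To show what a mangled builtin name looks like, here is a minimal Python sketch of Itanium C++ ABI mangling for free functions with simple scalar parameters. The mangle helper and its type table are hypothetical and deliberately incomplete: pointers, address spaces, vector types and substitutions are not handled, and this is not how Clang implements mangling.

```python
# Itanium C++ ABI codes for a few builtin types (subset, for illustration).
TYPE_CODES = {
    "void": "v", "char": "c", "int": "i", "unsigned int": "j",
    "long": "l", "unsigned long": "m", "float": "f", "double": "d",
    "...": "z",
}

def mangle(name: str, params: list[str]) -> str:
    """Mangle a free function that takes only simple scalar parameters."""
    # An empty parameter list is encoded as a single "v" (void).
    codes = "".join(TYPE_CODES[p] for p in params) or "v"
    return f"_Z{len(name)}{name}{codes}"

# get_global_id takes a uint (unsigned int) dimension index in OpenCL C.
print(mangle("get_global_id", ["unsigned int"]))
```

The name is prefixed with its length so that the parameter codes that follow can be told apart from the identifier itself.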

[LLVMdev] Build PTX samples with LLVM/Clang/libclc

2011 Dec 07

On Tue, Dec 6, 2011 at 10:17 PM, 陳韋任 <chenwj at iis.sinica.edu.tw> wrote:
> Hi Justin,
>
> I downloaded llvm-ptx-samples [1] and tried to build them. There seems to be
> no complete document on how to build them with LLVM/Clang/libclc. Do you think
> it's a good idea to put a complete document/tutorial in _one_ place? Currently,
> there are your

builtins name mangling in SPIR 2.0

2016 Sep 16

+ Alexey Anastasia
According to SPIR spec v1.2 s2.10.3:
2.10.3 The printf function
The printf function is supported, and is mangled according to its prototype as follows:
    int printf(constant char * restrict fmt, ... )
Note that the ellipsis formal argument (...) is mangled to argument type specifier z
It seems printf should be mangled. Alexey/Anastasia, What do you think? Thanks. Sam From:

builtins name mangling in SPIR 2.0

2016 Sep 18

I don't see any problem mangling it to be honest even though there seems to be only one prototype anyways. We could add restrict in as well. Cheers, Anastasia
________________________________
From: Hongbin Zheng <etherzhhb at gmail.com>
Sent: 17 September 2016 05:32:54
To: Liu, Yaxun (Sam)
Cc: cfe-dev at lists.llvm.org; llvm-dev; Bader, Alexey (alexey.bader at intel.com); Anastasia

Some llvm questions (for tgsi backend)

2016 Jan 12

Hi Tom, Thanks for taking the time to answer this.
On 11-01-16 18:10, Tom Stellard wrote:
> On Mon, Jan 11, 2016 at 12:07:14PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> After a few distractions I'm back to work on the llvm tgsi backend. I've
>> added clang integration and I can now compile a simple opencl program
>> to something which sort of looks like

[LLVMdev] [NVPTX] CUDA inline PTX asm definitions scoping "{" "}" is broken

2012 Jul 10

Hi, It looks like "{" and "}" are lost when using the combination of Clang and NVPTX, which may result in a clash between function-scope and asm-scope definitions. Here is an example:
> cat test.cu
__attribute__((device)) __attribute__((nv_linkonce_odr)) __inline__ int __any(int a)
{
  int result;
  asm __volatile__ ("{ \n\t" ".reg .pred

[Fast Int64 1/4] Move OPUS_FAST_INT64 definition to celt/arch.h.

2015 Nov 16

---
 celt/arch.h   | 5 +++++
 silk/macros.h | 4 +---
 2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/celt/arch.h b/celt/arch.h
index 9f74ddd..670527b 100644
--- a/celt/arch.h
+++ b/celt/arch.h
@@ -78,6 +78,11 @@ static OPUS_INLINE void _celt_fatal(const char *str, const char *file, int line)
 #define UADD32(a,b) ((a)+(b))
 #define USUB32(a,b) ((a)-(b))
+/* Set this if opus_int64
