Displaying 15 results from an estimated 15 matches for "amdhsa".

2017 May 08
2
[OpenCL][AMDGPU] Using AMDGPU generated kernel code for OpenCL
...ss that to OpenCL's 'clCreateProgramWithBinary' function. However, when doing the same with the AMDGPU backend and its returned kernel string, OpenCL complains about an invalid binary. This has been tried with a number of different target triples (e.g. 'amdgcn--', 'amdgcn-amd-amdhsa', etc.), and my assumption so far is that I am not using the correct triple. Or am I missing something entirely, and are additional steps needed to get the correct ELF binary? Thank you in advance for any help and pointers! Best, Philipp
2018 Sep 05
4
Can I control HSA config generated by AMDGPU backend?
...> > > > Thanks, > > Changdao > > > > On Mon, Sep 3, 2018 at 5:25 AM Tamazov, Artem <Artem.Tamazov at amd.com> > wrote: > > Hello, > > > > Please look into https://llvm.org/docs/AMDGPUUsage.html. > > > > > My target is amdgpu--amdhsa. > > > > This means that the kernel(s) are to be executed on HSA compatible > runtimes such as AMD’s ROCm. > > > > > ..."enable_sgpr_dispatch_ptr = 1". Can I do something to turn that off > in the generated assembly file? > > > ...user argument is...
2015 Sep 29
2
OpenCL toolset (for AMD GPU)
..., > > You need to include OpenCL library headers from libclc > (http://libclc.llvm.org/) to compile most OpenCL code. > > Here is an example command: > > clang -include /path/to/libclc/headers/clc.h -I /path/to/libclc/headers -Dcl_clang_storage_class_specifiers -target amdgcn--amdhsa -mcpu=carrizo $INPUT_FILE -o $OUTPUT_FILE Hi Tom, to piggy-back on this question: to load this kernel in OpenCL, is it sufficient to just pass $OUTPUT_FILE to clCreateProgramWithBinary? Also, assuming this is enough, is the code quality for recent AMD GPUs on the level of what AMD...
2019 Sep 09
2
LiveInterval error with 2 dead defs
....0: dead %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec dead %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec ... The live intervals look OK to me with 1 valno per instruction, for the life of the instruction like I would expect. The verifier does not like it however: $ llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -verify-machineinstrs -run-pass=machine-scheduler -o - -verify-misched foo.mir # Before machine scheduling. ********** INTERVALS ********** %0 [16r,16d:1)[32r,32d:0) 0 at 32r 1 at 16r weight:0.000000e+00 RegMasks: ********** MACHINEINSTRS ********** # Machine code for function multip...
2015 Sep 29
2
OpenCL toolset (for AMD GPU)
Hi LLVM, I would like to compile an OpenCL kernel for a specific AMD GPU target. Is it possible with the current clang/LLVM? I started by using `clang -x cl` but it looks like at least some OpenCL-specific headers are missing (e.g. uint2 is not recognized as a type). Any links to documentation / tutorials are very welcome. Thanks. - Paweł
2016 Mar 05
2
[AMDGPU] non-hsa intrinsic with hsa target
...at addrspace(1)* %arrayidx, align 4, !tbaa !8 ret void } which cannot be handled by llc with the message "the non-hsa instrinsic with hsa target shown". After looking into the log (r259297), my question is that is there other intrinsic that support this case when the target is amdgcn--amdhsa? In the log of r259297, it states that AMDGPUPromoteAlloca pass (a backend pass) will generate this intrinsic, but even when I just emit-llvm without going through llc, this intrinsic is still emitted. [1] https://github.com/tstellarAMD/hsa-runtime Regards, 李弘宇 (Li, Hong-Yu) Department of Comp...
2015 Sep 03
6
Testing "normal" cross-compilers versus GPU backends
...value. "ninja check" worked fine (without Mehdi's series of commits), so the normal kind of cross-compiler environment seems to be happy with how things were set up originally. Mehdi reports building LLVM with the X86 and AMDGPU backends, setting the default triple to "amdgcn--amdhsa", and getting 200-some failures. (This does make me wonder about AMDGPU testing in general; how does that work? The only places I see lit checks for AMDGPU are in the usual target-dependent places.) Mehdi's solution was: - In lit.cfg, change the existing "native" feature defi...
2015 Sep 03
2
Testing "normal" cross-compilers versus GPU backends
...hout Mehdi's series of commits), so the > > normal kind of cross-compiler environment seems to be happy with how > > things were set up originally. > > > > Mehdi reports building LLVM with the X86 and AMDGPU backends, setting > > the default triple to "amdgcn--amdhsa", and getting 200-some failures. > > > > (This does make me wonder about AMDGPU testing in general; how does that > > work? The only places I see lit checks for AMDGPU are in the usual > > target-dependent places.) > > I don’t understand this interrogation about...
2016 Mar 05
2
[AMDGPU] non-hsa intrinsic with hsa target
...> } >> >> which cannot be handled by llc with the message "the non-hsa instrinsic >> with hsa target shown". >> >> After looking into the log (r259297), my question is that is there other >> intrinsic that support this case when the target is amdgcn--amdhsa? In the >> log of r259297, it states that AMDGPUPromoteAlloca pass (a backend pass) >> will generate this intrinsic, but even when I just emit-llvm without going >> through llc, this intrinsic is still emitted. >> >> [1] https://github.com/tstellarAMD/hsa-runtime >...
2015 Sep 03
3
Testing "normal" cross-compilers versus GPU backends
...he > >>> normal kind of cross-compiler environment seems to be happy with how > >>> things were set up originally. > >>> > >>> Mehdi reports building LLVM with the X86 and AMDGPU backends, setting > >>> the default triple to "amdgcn--amdhsa", and getting 200-some failures. > >>> > >>> (This does make me wonder about AMDGPU testing in general; how does that > >>> work? The only places I see lit checks for AMDGPU are in the usual > >>> target-dependent places.) > >> > &...
2019 Oct 07
2
LiveInterval error with 2 dead defs
....0: dead %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec dead %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec ... The live intervals look OK to me with 1 valno per instruction, for the life of the instruction like I would expect. The verifier does not like it however: $ llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -verify-machineinstrs -run-pass=machine-scheduler -o - -verify-misched foo.mir # Before machine scheduling. ********** INTERVALS ********** %0 [16r,16d:1)[32r,32d:0) 0 at 32r 1 at 16r weight:0.000000e+00 RegMasks: ********** MACHINEINSTRS ********** # Machine code for function multip...
2015 Sep 03
3
Testing "normal" cross-compilers versus GPU backends
...ross-compiler environment seems to be happy with how >>>>>> things were set up originally. >>>>>> >>>>>> Mehdi reports building LLVM with the X86 and AMDGPU backends, >> setting >>>>>> the default triple to "amdgcn--amdhsa", and getting 200-some >> failures. >>>>>> >>>>>> (This does make me wonder about AMDGPU testing in general; how does >> that >>>>>> work? The only places I see lit checks for AMDGPU are in the usual >>>>>> t...
2017 Mar 02
5
Structurizing multi-exit regions
...; RUN: opt -S -structurizecfg -si-annotate-control-flow %s target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:64:64-p5:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64" target triple = "amdgcn-amd-amdhsa-opencl" ; Function Attrs: nounwind define amdgpu_kernel void @multi_divergent_region_exit(i32 addrspace(1)* nocapture %arg0, i32 addrspace(1)* nocapture %arg1, i32 addrspace(1)* nocapture %arg2) #0 { entry: %tmp = tail call i32 @llvm.amdgcn.workitem.id.x() #1 %tmp1 = add i32 0, %tmp %tm...
2017 Oct 01
2
load with alignment of 1 crashes from being unaligned
...DIEnumerator(name: "cnk", value: 20) !29 = !DIEnumerator(name: "bitrig", value: 21) !30 = !DIEnumerator(name: "aix", value: 22) !31 = !DIEnumerator(name: "cuda", value: 23) !32 = !DIEnumerator(name: "nvcl", value: 24) !33 = !DIEnumerator(name: "amdhsa", value: 25) !34 = !DIEnumerator(name: "ps4", value: 26) !35 = !DIEnumerator(name: "elfiamcu", value: 27) !36 = !DIEnumerator(name: "tvos", value: 28) !37 = !DIEnumerator(name: "watchos", value: 29) !38 = !DIEnumerator(name: "mesa3d", value: 30...
2015 Sep 04
4
Testing "normal" cross-compilers versus GPU backends
...>> how >>>>>>>> things were set up originally. >>>>>>>> >>>>>>>> Mehdi reports building LLVM with the X86 and AMDGPU backends, >>>> setting >>>>>>>> the default triple to "amdgcn--amdhsa", and getting 200-some >>>> failures. >>>>>>>> >>>>>>>> (This does make me wonder about AMDGPU testing in general; how does >>>> that >>>>>>>> work? The only places I see lit checks for AMDGPU are...