Displaying 20 results from an estimated 1000 matches similar to: "[OpenCL][AMDGPU] Using AMDGPU generated kernel code for OpenCL"
2015 Sep 29
2
OpenCL toolset (for AMD GPU)
On 09/29/2015 04:19 PM, Tom Stellard via llvm-dev wrote:
> On Tue, Sep 29, 2015 at 01:20:57PM +0000, Paweł Bylica via llvm-dev wrote:
>> Hi LLVM,
>>
>> I would like to compile an OpenCL kernel for a specific AMD GPU target. Is it
>> possible with the current clang/LLVM?
>>
>> I started by using `clang -x cl` but it looks like at least some OpenCL
>>
2015 Sep 29
2
OpenCL toolset (for AMD GPU)
Hi LLVM,
I would like to compile an OpenCL kernel for a specific AMD GPU target. Is it
possible with the current clang/LLVM?
I started by using `clang -x cl`, but it looks like at least some OpenCL-specific
headers are missing (e.g. uint2 is not recognized as a type).
Any links to documentation or tutorials are very welcome. Thanks.
- Paweł
2016 Mar 05
2
[AMDGPU] non-hsa intrinsic with hsa target
Dear Developers,
I compiled an OpenCL kernel earlier (last November), like
__kernel void g(__global float* array)
{
    array[get_global_id(0)] = 1;
}
with libclc, which would originally use intrinsics like
llvm.r600.read.local.size.x().
I executed the generated object file with one version of the hsa-runtime
[1] provided by Mr. Stellard. When there was more than one workgroup, the
output
2016 Mar 05
2
[AMDGPU] non-hsa intrinsic with hsa target
Hi Mr. Liu,
Thanks for your quick reply.
I compiled the code with the libclc_trunk and linked the bitcode file under
$LIBCLC_DIR/built_libs/tahiti-amdgcn--.bc. After looking into the libclc,
it is currently using the new workitem intrinsics
(commit ba9858caa1e927a6fcc601e3466faa693835db5e). In the linked bitcode
($LIBCLC_DIR/built_libs/tahiti-amdgcn--.bc), it has the following code
segment,
2018 Sep 05
4
Can I control HSA config generated by AMDGPU backend?
I finally modified LLVM to generate assembly that can run on the AMDGPU
pro drivers. One problem is that the code generated by LLVM is about 10%
slower than the code from amdgpu's online compiler. Is there anything I can
tune to improve the performance of the LLVM-generated code?
Thanks!
On Tue, Sep 4, 2018 at 9:23 AM 董昌道 <dongchangdao at gmail.com> wrote:
> I am writing a miner of crypto
2019 Sep 09
2
LiveInterval error with 2 dead defs
Hi,
I’m hitting a machine verifier error in a trivial testcase which I don’t understand. There are 2 dead defs of the same register:
---
name: multiple_connected_compnents_dead
tracksRegLiveness: true
body: |
  bb.0:
    dead %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
    dead %0:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
...
The live intervals look OK to me with 1 valno
2017 Jun 15
1
Implementing cross-thread reduction in the AMDGPU backend
I'm wondering about the focus on bound_cntl. Any cleared bit in the row_mask or bank_mask will also disable updating the result.
Brian
-----Original Message-----
From: Connor Abbott [mailto:cwabbott0 at gmail.com]
Sent: Wednesday, June 14, 2017 6:13 PM
To: tstellar at redhat.com
Cc: Matt Arsenault; llvm-dev at lists.llvm.org; Kolton, Sam; Sumner, Brian; Pykhtin, Valery
Subject: Re:
2017 Jun 15
2
Implementing cross-thread reduction in the AMDGPU backend
On 06/14/2017 05:05 PM, Connor Abbott wrote:
> On Tue, Jun 13, 2017 at 6:13 PM, Tom Stellard <tstellar at redhat.com> wrote:
>> On 06/13/2017 07:33 PM, Matt Arsenault wrote:
>>>
>>>> On Jun 12, 2017, at 17:23, Tom Stellard <tstellar at redhat.com <mailto:tstellar at redhat.com>> wrote:
>>>>
>>>> On 06/12/2017 08:03 PM, Connor
2015 Sep 03
6
Testing "normal" cross-compilers versus GPU backends
This note arose from http://reviews.llvm.org/D12506 but the reviewers
felt that we needed a broader audience, because the proposed patch
really didn't solve the entire problem and we had no better ideas.
Mehdi Amini needs to build LLVM with just a GPU backend, and still have
"ninja check" Just Work. Commits r243958-243960 tried to accomplish
that; however they are too big a hammer,
2017 Jun 13
2
Implementing cross-thread reduction in the AMDGPU backend
On 06/12/2017 08:03 PM, Connor Abbott wrote:
> On Mon, Jun 12, 2017 at 4:56 PM, Tom Stellard <tstellar at redhat.com> wrote:
>> On 06/12/2017 07:15 PM, Tom Stellard via llvm-dev wrote:
>>> cc some people who have worked on this.
>>>
>>> On 06/12/2017 05:58 PM, Connor Abbott via llvm-dev wrote:
>>>> Hi all,
>>>>
>>>>
2017 Jun 14
0
Implementing cross-thread reduction in the AMDGPU backend
Sorry about the formatting...
Anyway, I think there may be a misinterpretation of bound_cntl. My understanding is that:
0 => if the source is invalid or disabled, do not write a result
1 => if the source is invalid or disabled, use a 0 instead
So the problematic case is where bound_cntl is 0, not when it is 1.
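The two bound_cntl cases described above can be sketched as a toy per-lane model. This is only an illustration of the semantics as stated in this message, not hardware-accurate code; the function name `dpp_lane_result` and its parameters are hypothetical.

```python
def dpp_lane_result(old_dest, src_value, src_valid, bound_cntl):
    """Toy model of one lane's destination value (hypothetical helper).

    old_dest   -- value already in the destination register for this lane
    src_value  -- value read from the source lane
    src_valid  -- False if the source lane is invalid or disabled
    bound_cntl -- 0 or 1, as described in the message above
    """
    if src_valid:
        # Normal case: the source lane's value is written.
        return src_value
    if bound_cntl:
        # bound_cntl = 1: an invalid or disabled source reads as 0.
        return 0
    # bound_cntl = 0: no result is written, so the old value survives.
    return old_dest
```

With bound_cntl set to 0, the destination silently keeps whatever was there before, which is why the message singles that case out as the problematic one.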
-----Original Message-----
From: Tom Stellard [mailto:tstellar at redhat.com]
2017 Jun 12
2
Implementing cross-thread reduction in the AMDGPU backend
On 06/12/2017 07:15 PM, Tom Stellard via llvm-dev wrote:
> cc some people who have worked on this.
>
> On 06/12/2017 05:58 PM, Connor Abbott via llvm-dev wrote:
>> Hi all,
>>
>> I've been looking into how to implement the more advanced Shader Model
>> 6 reduction operations in radv (and obviously most of the work would
>> be useful for radeonsi too).
2017 Jun 14
5
Implementing cross-thread reduction in the AMDGPU backend
On 06/13/2017 07:33 PM, Matt Arsenault wrote:
>
>> On Jun 12, 2017, at 17:23, Tom Stellard <tstellar at redhat.com <mailto:tstellar at redhat.com>> wrote:
>>
>> On 06/12/2017 08:03 PM, Connor Abbott wrote:
>>> On Mon, Jun 12, 2017 at 4:56 PM, Tom Stellard <tstellar at redhat.com <mailto:tstellar at redhat.com>> wrote:
>>>> On
2017 Jun 12
4
Implementing cross-thread reduction in the AMDGPU backend
Hi all,
I've been looking into how to implement the more advanced Shader Model
6 reduction operations in radv (and obviously most of the work would
be useful for radeonsi too). They're explained in the spec for
GL_AMD_shader_ballot at
https://www.khronos.org/registry/OpenGL/extensions/AMD/AMD_shader_ballot.txt,
but I'll summarize them here. There are two types of operations:
2015 Jun 08
2
[LLVMdev] R600 -> AMDGPU rename coming on Friday
Hi,
I'm finally going to do the R600->AMDGPU rename this Friday. This is something
I originally proposed last July [1], but had to put off in order to avoid
creating really bad merge headaches for some users. The only change from
my original proposal is that I'll just keep the existing r600 and amdgcn
triples rather than adding a new one for amdgpu. If anyone has any
strong
2017 Dec 05
2
[AMDGPU] Strange results with different address spaces
> On Dec 5, 2017, at 13:53, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
>
>
>> On Dec 5, 2017, at 02:51, Haidl, Michael via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
>>
>> Hi dev list,
>>
>> I am currently exploring the integration of AMDGPU/ROCm into the PACXX project and observing some
2015 Sep 03
2
Testing "normal" cross-compilers versus GPU backends
> -----Original Message-----
> From: Mehdi Amini [mailto:mehdi.amini at apple.com]
> Sent: Wednesday, September 02, 2015 7:10 PM
> To: Robinson, Paul
> Cc: llvm-dev at lists.llvm.org; tom at stellard.net; NAKAMURA Takumi
> Subject: Re: Testing "normal" cross-compilers versus GPU backends
>
> Hi Paul,
>
> Thanks for the summary!
>
> > On Sep 2,
2017 Dec 06
2
[AMDGPU] Strange results with different address spaces
> On Dec 6, 2017, at 02:28, Haidl, Michael <michael.haidl at uni-muenster.de> wrote:
>
> The IR goes through a backend agnostic preparation phase that brings it into SSA form and changes the AS from 0 to 1.
This sounds possibly problematic to me. The IR should be created with the correct address space to begin with. Changing this in the middle sounds suspect.
> After this
2015 Oct 23
3
[AMDGPU] AMDGPUAsmParser fails to parse several instructions
Dear Developers,
I am compiling an OpenCL kernel, FFT, from AMDAPP SDK v2.5 using clang 3.8 + libclc
and assembling the code with lld (the LLVM linker). The assembly contains
the following instructions (and many others of a similar format) that
AMDGPUAsmParser fails to parse. Both appear to me to be valid
instructions according to the SI instruction spec.
s_mov_b32 s0,
2015 Sep 03
3
Testing "normal" cross-compilers versus GPU backends
On Thu, Sep 03, 2015 at 02:07:54AM -0700, Mehdi Amini wrote:
>
> > On Sep 3, 2015, at 12:18 AM, Robinson, Paul <Paul_Robinson at playstation.sony.com> wrote:
> >
> >
> >
> >> -----Original Message-----
> >> From: Mehdi Amini [mailto:mehdi.amini at apple.com]
> >> Sent: Wednesday, September 02, 2015 7:10 PM
> >> To: Robinson,