thr3ads.net - search: "workitems"

Displaying 20 results from an estimated 33 matches for "workitems".

[AMDGPU] Strange results with different address spaces

2017 Dec 05

> On Dec 5, 2017, at 13:53, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
>> On Dec 5, 2017, at 02:51, Haidl, Michael via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi dev list,
>>
>> I am currently exploring the integration of AMDGPU/ROCm into the PACXX project and observing some

[AMDGPU] non-hsa intrinsic with hsa target

2016 Mar 05

Hi Mr. Liu, Thanks for your quick reply. I compiled the code with the libclc_trunk and linked the bitcode file under $LIBCLC_DIR/built_libs/tahiti-amdgcn--.bc. After looking into the libclc, it is currently using the new workitem intrinsics (commit ba9858caa1e927a6fcc601e3466faa693835db5e). In the linked bitcode ($LIBCLC_DIR/built_libs/tahiti-amdgcn--.bc), it has the following code segment,

[LLVMdev] [RFC] "noclone" function attribute

2012 Dec 01

Hi, OpenCL has a "barrier" function with very specific semantics, and there is currently no analogue to model this in LLVM. This has been touched on by the SPIR folks but I don't believe they put forward a proposal. The barrier function is a special function that ensures that all workitems executing a kernel have executed up to that point before execution on any workitem can continue. The CL spec is specific about how user kernels can use barriers - the sequence of barriers that are hit by all workitems in a workgroup must be identical. An issue occurs when defining what "the s...
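
The barrier semantics described in this excerpt can be illustrated on the host with a minimal sketch. It assumes Python threads standing in for workitems and `threading.Barrier` standing in for the OpenCL `barrier` call; `run_workgroup` and `workitem` are hypothetical names used only for illustration, and nothing here models how LLVM represents barriers.

```python
import threading


def run_workgroup(local_size):
    """Simulate one workgroup whose workitems synchronize at a barrier."""
    local_mem = [None] * local_size   # stands in for OpenCL __local memory
    result = [None] * local_size
    barrier = threading.Barrier(local_size)

    def workitem(local_id):
        # Phase 1: each workitem writes its own slot.
        local_mem[local_id] = local_id * 10
        # No workitem continues until all of them have arrived here.
        barrier.wait()
        # Phase 2: read a slot written by a different workitem.
        neighbour = (local_id + 1) % local_size
        result[local_id] = local_mem[neighbour]

    threads = [threading.Thread(target=workitem, args=(i,))
               for i in range(local_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


print(run_workgroup(4))
```

Because every workitem passes the same single barrier, each read in phase 2 observes the value written in phase 1. If only some workitems reached the barrier, the others would wait indefinitely, which is why the sequence of barriers hit must be identical across the workgroup.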

[LLVMdev] Loads moving across barriers

2013 Nov 09

...el than other address spaces? While it's less worrisome than the first interpretation, I still don't really like it. > This sounds right. With the constant address space, anything you do is OK since it’s constant. Private address space is supposed to be totally inaccessible from other workitems, so parallel modifications aren’t a concern. The others require explicit synchronization which noalias would need to be aware of.

[LLVMdev] [RFC] "noclone" function attribute

2012 Dec 02

...Hi, OpenCL has a "barrier" function with very specific semantics, and there is currently no analogue to model this in LLVM. This has been touched on by the SPIR folks but I don't believe they put forward a proposal. The barrier function is a special function that ensures that all workitems executing a kernel have executed up to that point before execution on any workitem can continue. The CL spec is specific about how user kernels can use barriers - the sequence of barriers that are hit by all workitems in a workgroup must be identical. An issue occurs when defining what "the s...

[AMDGPU] Strange results with different address spaces

2017 Dec 05

Hi dev list, I am currently exploring the integration of AMDGPU/ROCm into the PACXX project and observing some strange behavior of the AMDGPU backend. The following IR is generated for a simple address space test that copies from global to shared memory and back to global after a barrier synchronization. The IR is attached as as1.ll. The output is as follows: 0 0 0 0 0 0 0 0 0 0 0 0 0

[AMDGPU] non-hsa intrinsic with hsa target

2016 Mar 05

Dear Developers, I compiled an OpenCL kernel before (last November) like __kernel void g(__global float* array) { array[get_global_id(0)] = 1; } with libclc, which would originally use intrinsics like llvm.r600.read.local.size.x(). I executed the generated object file with one version of the hsa-runtime [1] provided by Mr. Stellard; when there was more than one workgroup, the output
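
For context on why a local-size query matters to this kernel: in the OpenCL execution model, the global id along one dimension is the global offset plus group id times local size plus local id. The Python sketch below simulates the kernel from the post on the host under that formula. `global_id`, `run_kernel`, and the `reported_local_size` parameter are hypothetical names for illustration, not libclc or hsa-runtime APIs, and the sketch does not claim to diagnose the reported problem.

```python
def global_id(group_id, local_size, local_id, global_offset=0):
    """OpenCL-style global id along one dimension."""
    return global_offset + group_id * local_size + local_id


def run_kernel(global_size, local_size, reported_local_size=None):
    """Simulate `array[get_global_id(0)] = 1` over all workgroups.

    reported_local_size is a hypothetical knob: the value the kernel
    would see if the local-size query returned something other than
    the real local size.
    """
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    seen = local_size if reported_local_size is None else reported_local_size
    array = [0.0] * global_size
    for group_id in range(global_size // local_size):
        for local_id in range(local_size):
            array[global_id(group_id, seen, local_id)] = 1.0
    return array


print(run_kernel(8, 4))
print(run_kernel(8, 4, reported_local_size=0))
```

With the correct local size every element is written. If the query reported 0, every workgroup would write the same first `local_size` elements, which is one way output could go wrong only when more than one workgroup runs.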

[LLVMdev] Loads moving across barriers

2013 Nov 11

...address spaces? While it's less worrisome than the first interpretation, I still don't really like it. >> > > This sounds right. With the constant address space, anything you do is OK since it’s constant. Private address space is supposed to be totally inaccessible from other workitems, so parallel modifications aren’t a concern. The others require explicit synchronization which noalias would need to be aware of. FWIW, it seems generally useful to me to have a nomemfence function attribute and intrinsic property. We should avoid memory optimization (and possibly other optimizati...

Are AMDGPU intrinsics available in LLVM IR ?

2020 Apr 13

Hi! I'm trying to figure out how to access the workgroup id from within the LLVM IR language when lowering with the AMDGPU backend. Looking at the 'llvm/include/llvm/IR/IntrinsicsAMDGPU.td' file there are intrinsics defined to access the workitem index (thread index), but this file lives in 'llvm/include':

[LLVMdev] Loads moving across barriers

2013 Dec 04

...address spaces? While it's less worrisome than the first interpretation, I still don't really like it. >>> >> This sounds right. With the constant address space, anything you do is OK since it’s constant. Private address space is supposed to be totally inaccessible from other workitems, so parallel modifications aren’t a concern. The others require explicit synchronization which noalias would need to be aware of. > FWIW, it seems generally useful to me to have a nomemfence function attribute and intrinsic property. We should avoid memory optimization (and possibly other optimi...

CompiledCode execution + using scope + local variables in a loop => NullReferenceException

2011 Aug 02

Hi, I have reported a bug on codeplex: http://ironruby.codeplex.com/workitem/6353 Will it be fixed in the next release? Thank you, Anton

[LLVMdev] Loads moving across barriers

2013 Dec 05

...While it's less worrisome than the first interpretation, I still don't really like it. >>>>> >>>> This sounds right. With the constant address space, anything you do is OK since it’s constant. Private address space is supposed to be totally inaccessible from other workitems, so parallel modifications aren’t a concern. The others require explicit synchronization which noalias would need to be aware of. >>> FWIW, it seems generally useful to me to have a nomemfence function attribute and intrinsic property. We should avoid memory optimization (and possibly othe...

[LLVMdev] Loads moving across barriers

2013 Dec 05

...paces? While it's less worrisome than the first interpretation, I still don't really like it. >>>> >>> This sounds right. With the constant address space, anything you do is OK since it’s constant. Private address space is supposed to be totally inaccessible from other workitems, so parallel modifications aren’t a concern. The others require explicit synchronization which noalias would need to be aware of. >> FWIW, it seems generally useful to me to have a nomemfence function attribute and intrinsic property. We should avoid memory optimization (and possibly other op...

Memory scope proposal

2016 Jan 28

Hi all, Currently, the LLVM IR uses a binary value (SingleThread/CrossThread) to represent synchronization scope on atomic instructions. We would like to enhance the representation of memory scopes in LLVM IR to allow more values than just the current two. The intention of this email is to invite comments on our proposal. There has been some discussion before, which can be found here:

[LLVMdev] Loads moving across barriers

2013 Dec 05

...'s less worrisome than the first interpretation, I still don't really like it. >>>>>> >>>>> This sounds right. With the constant address space, anything you do is OK since it’s constant. Private address space is supposed to be totally inaccessible from other workitems, so parallel modifications aren’t a concern. The others require explicit synchronization which noalias would need to be aware of. >>>> FWIW, it seems generally useful to me to have a nomemfence function attribute and intrinsic property. We should avoid memory optimization (and possibly...

[AMDGPU] Strange results with different address spaces

2017 Dec 06

> On Dec 6, 2017, at 02:28, Haidl, Michael <michael.haidl at uni-muenster.de> wrote:
>
> The IR goes through a backend-agnostic preparation phase that brings it into SSA form and changes the AS from 0 to 1.
This sounds possibly problematic to me. The IR should be created with the correct address space to begin with. Changing this in the middle sounds suspect.
> After this

Memory scope proposal

2016 Mar 22

Dear all, Here is the plain text version of the proposal: Currently, the LLVM IR uses a binary value (SingleThread/CrossThread) to represent synchronization scope on atomic instructions. We would like to enhance the representation of memory scopes in LLVM IR to allow more values than just the current two. The intention of this email is to invite comments on our proposal. There are some

Memory scope proposal

2016 Mar 29

Ke, I'll be the bearer of bad news here. The radio silence this proposal has gotten probably means there is not enough interest in the community in this proposal to see it land. One concern I have with the current proposal is that the optimization value of these scopes is not clear to me. Is it only the backend which is expected to support optimizations over these scopes? Or are you

Dealing with illegal operand mappings in RegBankSelect

2019 Feb 26

> On Feb 21, 2019, at 12:18 AM, Quentin Colombet via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi Matt,
>
>> On Feb 20, 2019, at 4:49 PM, Arsenault, Matthew via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> Some operations on AMDGPU require operands which must be in a register

[LLVMdev] Loads moving across barriers

2013 Dec 21

...ss worrisome than the first interpretation, I still don't really like it. >>>>>>> >>>>>> This sounds right. With the constant address space, anything you do is OK since it’s constant. Private address space is supposed to be totally inaccessible from other workitems, so parallel modifications aren’t a concern. The others require explicit synchronization which noalias would need to be aware of. >>>>> FWIW, it seems generally useful to me to have a nomemfence function attribute and intrinsic property. We should avoid memory optimization (and possi...