thr3ads.net - similar to: "[RFC] Adding thread group semantics to LangRef (motivated by GPUs)"

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 24

I don't see how this would fix the continue vs. nested loop problem I explained earlier. That is, how would this prevent turning:

    for (...) {
        ballot();
        if (... /* non-uniform */) continue;
    }

into

    for (...) {
        do {
            ballot();
        } while (... /* non-uniform */);
    }

and vice versa? Note that there's no duplication going on here, and the single-threaded flow of control is

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 28

On Fri, Jan 25, 2019 at 3:05 AM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> > for (...) {
> >   ballot();
> >   if (... /* non-uniform */) continue;
> > }
> >
> > into
> >
> > for (...) {
> >   do {
> >     ballot();
> >   } while (... /* non-uniform */);
> > }
>
> I'm not sure if I follow

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2018 Dec 29

On 20.12.18 18:03, Connor Abbott wrote:
> We already have the notion of "convergent" functions like
> syncthreads(), to which we cannot add control-flow dependencies.
> That is, it's legal to hoist syncthreads out of an "if", but it's
> not legal to sink it into an "if". It's not clear to me why we
> can't have
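
The rule quoted above (a convergent call may lose control dependencies but must not gain new ones) can be sketched as a subset test. This is only an illustrative model in C++, not LLVM code: `ControlDeps` and `may_move_convergent_call` are hypothetical names, and a control dependency is represented simply as the name of a branch condition.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Hypothetical model: the set of branch conditions a call site is
// control-dependent on, identified by name.
using ControlDeps = std::set<std::string>;

// Under the "no added control-flow dependencies" reading, a move is
// acceptable when every dependency at the new location was already a
// dependency at the old one (after is a subset of before).
bool may_move_convergent_call(const ControlDeps& before,
                              const ControlDeps& after) {
    // std::set iterates in sorted order, which std::includes requires.
    return std::includes(before.begin(), before.end(),
                         after.begin(), after.end());
}
```

With this model, hoisting out of an "if" goes from `{"cond"}` to `{}` and is accepted, while sinking into an "if" goes from `{}` to `{"cond"}` and is rejected.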

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 30

On Mon, Jan 28, 2019 at 9:09 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> > for (int i = 0; i < 2; i++) {
> >   foo = ballot(true); // ballot 1
> >
> >   if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue;
> >
> >   bar = ballot(true); // ballot 2
> > }
> >
> > versus:
> >
> > int i =
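
To make the first quoted loop concrete, here is a small host-side C++ model of a 4-lane wave. It is a sketch under a stated assumption: all lanes reconverge at the loop header on every iteration, which is exactly the property the thread is debating and not something the IR guarantees. `kLanes`, `Trace`, `ballot`, and `run_reconverging_loop` are hypothetical names, and `ballot` here is a plain function, not a real GPU intrinsic.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical wave width for the model.
constexpr int kLanes = 4;

// Models ballot(pred): bit l is set when lane l is active and its
// predicate is true.
std::uint32_t ballot(std::uint32_t active, std::uint32_t pred) {
    return active & pred;
}

struct Trace {
    std::vector<std::uint32_t> ballot1;  // results of "ballot 1" per iteration
    std::vector<std::uint32_t> ballot2;  // results of "ballot 2" per iteration
};

// Runs the quoted loop, ASSUMING reconvergence at the loop header.
Trace run_reconverging_loop() {
    assert(kLanes > 0 && kLanes < 32);
    const std::uint32_t all = (1u << kLanes) - 1u;
    Trace t;
    for (int i = 0; i < 2; ++i) {
        std::uint32_t active = all;  // assumed: every lane is back together
        t.ballot1.push_back(ballot(active, all));  // ballot(true), ballot 1

        // Lanes with threadID % 2 == 0 take the `continue`.
        std::uint32_t continuing = 0;
        for (int lane = 0; lane < kLanes; ++lane)
            if (lane % 2 == 0) continuing |= 1u << lane;
        active &= ~continuing;

        t.ballot2.push_back(ballot(active, all));  // ballot(true), ballot 2
    }
    return t;
}
```

Under that assumption, ballot 1 sees all four lanes in both iterations and ballot 2 sees only the odd lanes. A transformation that changes where lanes reconverge could change these masks even though each lane's own control flow is unchanged.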

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 31

Strong agree with Mehdi, I am also not really sure what the proposal is at this point, so it's hard to comment further.

> There are a number of questions that I have. Do we need better machine descriptions so that various resources can be considered? Do we need the capability to reason about the machine state for the cross-lane operations to enable more optimizations? Are intrinsics the

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 31

On Wed, Jan 30, 2019 at 7:20 AM Jan Sjodin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > > > for (int i = 0; i < 2; i++) {
> > > >   foo = ballot(true); // ballot 1
> > > >
> > > >   if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue;
> > > >
> > > >   bar =

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Jan 30

On Wed, Jan 30, 2019 at 4:20 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> > > > for (int i = 0; i < 2; i++) {
> > > >   foo = ballot(true); // ballot 1
> > > >
> > > >   if (threadID /* ID of the thread within a wavefront/warp */ % 2 == 0) continue;
> > > >
> > > >   bar = ballot(true); //

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Feb 01

On 31.01.19 15:59, Jan Sjodin wrote:
>> > Any transform that re-arranges control flow would potentially have to
>> > know about the properties of ballot(), and the rules with respect to
>> > the CFG (and maybe consider the target) to know where to insert the
>> > intrinsics.
>
>> But the same is true for basically any approach to handling this. In

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

2019 Feb 09

On Sat, Feb 9, 2019 at 4:44 PM Jan Sjodin <jan_sjodin at yahoo.com> wrote:
> > The reason I'm looking for solutions that can work without "scanning the
> > code" or "spooky action at a distance" is that we should have a solution
> > that's easily digestible by folks who are not aware of GPU execution
> > models.
> >
> > The fallback

[LLVMdev] memory scopes in atomic instructions

2014 Nov 14

[LLVMdev] [PATCH][RFC] HSAIL Target

2015 May 13

Hi, AMD would like to propose including an LLVM backend for the HSAIL target. Patches for review are attached and can also be found at https://github.com/HSAFoundation/HLC-HSAIL-Development-LLVM/ on the hsail-review branch. Most of the recent work is visible on the hsail-1.0f branch, which is based on an LLVM commit approximately 1 month before 3.6 branched. The hsail-review branch is the

[LLVMdev] [RFC] Upstreaming LLVM/SPIR-V converter

2015 May 15

+1 to lib/Target/SPIRV/(Reader|Writer) I really like this idea. I’ve talked with some people on both the LLVM and Khronos sides and I really think adding SPIR-V support to LLVM as an optional program serialization format would be fantastic. I think it would make it even easier for LLVM-based tools to be integrated into GPU authoring and execution pipelines. I’m really excited to see this moving

[LLVMdev] [PATCH][RFC] HSAIL Target

2015 Jul 01

> On Jun 22, 2015, at 9:31 AM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
>
> This part is scary.
>
> Having a third party library dependency is very undesirable from a testing perspective.
>
> I agree, but it’s what we are stuck with for now. It’s an optional dependency now, so most people building LLVM won’t need to worry about it
>
> One of

[LLVMdev] memory scopes in atomic instructions

2014 Nov 14

On 11/15/2014 12:08 AM, Tom Stellard wrote:
> Can you send a plain-text version of this email. It's easier to read
> and reply to.

Sorry about that! Here's the plain text (I hope!):

Hi all,

OpenCL 2.0 introduced the notion of memory scope in atomic operations to global memory. These scopes are a hint to the underlying platform to optimize how synchronization is achieved. HSAIL

[LLVMdev] [RFC] Upstreaming LLVM/SPIR-V converter

2015 May 15

On Fri, May 15, 2015 at 11:50 AM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> On 15 May 2015, at 17:53, Chris Bieneman <beanz at apple.com> wrote:
> >
> > +1 to lib/Target/SPIRV/(Reader|Writer)
> >
> > I really like this idea. I’ve talked with some people on both the LLVM
> > and Khronos sides and I really think adding SPIR-V support to LLVM

US LLVM Dev Meeting 2019 - Round Table - Challenges using LLVM for GPU compilation

2019 Oct 18

Dear all,

I would like to announce a round table planned for the upcoming LLVM Dev meeting next week that will cover various topics related to the use of LLVM in compiler stacks for GPUs. Here is the initial list of discussion topics:
- Canonicalization vs. GPUs: Type mutation;
- Control flow mutation (graphics shaders are more sensitive to this);
- Divergence/reconvergence sensitivity;

[LLVMdev] RFC: Convergent attribute

2015 May 13

Below is a proposal for a new "convergent" intrinsic attribute and MachineInstr property, needed for correctly modeling many SPMD/SIMT programming models in LLVM. Comments and feedback welcome. —Owen In order to make LLVM more suitable for programming models variously called SPMD and SIMT, we would like to propose a new intrinsic and MachineInstr annotation called

US LLVM Dev Meeting 2019 - Round Table - Challenges using LLVM for GPU compilation

2019 Oct 18

Thanks, Marco! If there is enough interest in this topic we can also organize a separate round table for this discussion.

Cheers,
Anastasia
________________________________
From: Marco Antognini <Marco.Antognini at arm.com>
Sent: 18 October 2019 14:42
To: Anastasia Stulova <Anastasia.Stulova at arm.com>; Simone Atzeni via llvm-dev <llvm-dev at lists.llvm.org>; clang developer

[LLVMdev] RFC: Convergent attribute

2015 Aug 14

Hi Jingyue,

Convergent is not intended to prevent inlining. It’s tricky to formalize this inter-procedurally, but the intended interpretation is that a convergent operation cannot be moved either into or out of a conditionally executed region. Normal inlining would not violate that. I would imagine that it would make sense to use a combination of convergent and noduplicate for barrier-like

[LLVMdev] RFC: Convergent attribute

2015 Aug 14

Hi Mehdi,

My reading of it is that if you have a convergent instruction A, it is legal to duplicate it to instruction B if (assuming B is after A in program flow) A dominates B and B post-dominates A.

James

On Fri, 14 Aug 2015 at 08:32 Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> On Aug 13, 2015, at 9:43 PM, Owen Anderson via llvm-dev <llvm-dev at
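
The condition in James's reading (A dominates B and B post-dominates A) can be checked mechanically on a small control-flow graph. The C++ sketch below is illustrative only and does not use LLVM's own dominator-tree classes: `Graph`, `reversed`, `dominators`, and `may_duplicate` are hypothetical names, the CFG is a plain adjacency list of at most 32 nodes, and a single exit node is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CFG: succ[u] lists the successors of node u.
using Graph = std::vector<std::vector<int>>;

// Builds the graph with every edge flipped.
Graph reversed(const Graph& succ) {
    Graph rev(succ.size());
    for (std::size_t u = 0; u < succ.size(); ++u)
        for (int v : succ[u]) rev[v].push_back(static_cast<int>(u));
    return rev;
}

// Iterative data-flow dominators: dom[v] is a bitmask of the nodes that
// dominate v when execution starts at `root`.
std::vector<std::uint32_t> dominators(const Graph& succ, int root) {
    const std::size_t n = succ.size();
    assert(n > 0 && n <= 32);
    const Graph pred = reversed(succ);
    const std::uint32_t all = n == 32 ? ~0u : (1u << n) - 1u;
    std::vector<std::uint32_t> dom(n, all);
    dom[root] = 1u << root;
    for (bool changed = true; changed;) {
        changed = false;
        for (std::size_t v = 0; v < n; ++v) {
            if (static_cast<int>(v) == root) continue;
            std::uint32_t meet = all;
            for (int p : pred[v]) meet &= dom[p];  // intersect over preds
            meet |= 1u << v;                       // a node dominates itself
            if (meet != dom[v]) { dom[v] = meet; changed = true; }
        }
    }
    return dom;
}

// True when a dominates b and b post-dominates a. Post-dominators are
// computed as dominators of the reversed CFG rooted at the exit node.
bool may_duplicate(const Graph& cfg, int entry, int exit_node, int a, int b) {
    const auto dom = dominators(cfg, entry);
    const auto pdom = dominators(reversed(cfg), exit_node);
    return ((dom[b] >> a) & 1u) != 0 && ((pdom[a] >> b) & 1u) != 0;
}
```

On a diamond CFG (0 branches to 1 and 2, which both join at 3), `may_duplicate(cfg, 0, 3, 0, 3)` holds, while `may_duplicate(cfg, 0, 3, 0, 1)` does not, because node 1 sits on only one side of the branch.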
