thr3ads.net - similar to: "[LLVMdev] RFC: Convergent attribute"

2015 May 14

[LLVMdev] RFC: Convergent attribute

Why is this a regalloc problem? I assume in the example below the "r0" is somehow forced by the ABI? Because otherwise moving the texture2d operation into the branch wouldn't matter as long as we assign different registers to the two branches and use a technique like lib/Target/R600/SIFixSGPRLiveRanges.cpp. - Matthias > On May 13, 2015, at 6:00 PM, Philip Reames <listmail at

2015 Aug 14

[LLVMdev] RFC: Convergent attribute

Hi Jingyue, Convergent is not intended to prevent inlining. It’s tricky to formalize this inter-procedurally, but the intended interpretation is that a convergent operation cannot be moved either into or out of a conditionally executed region. Normal inlining would not violate that. I would imagine that it would make sense to use a combination of convergent and noduplicate for barrier-like

2015 Aug 14

[LLVMdev] RFC: Convergent attribute

Hi Mehdi, My reading of it is that if you have a convergent instruction A, it is legal to duplicate it to instruction B if (assuming B is after A in program flow) A dominates B and B post-dominates A. James On Fri, 14 Aug 2015 at 08:32 Mehdi Amini via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Aug 13, 2015, at 9:43 PM, Owen Anderson via llvm-dev < > llvm-dev at
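
The rule James describes (A dominates B and B post-dominates A) can be checked mechanically on a control-flow graph. Below is a minimal sketch, not LLVM's implementation: the block names, the `succ` successor map, and the helpers `dominators`, `reverse`, and `may_duplicate` are all hypothetical and exist only for illustration. It assumes a single entry block, a single exit block, and that every block appears as a key in `succ`.

```python
def dominators(succ, entry):
    """Return {block: set of blocks that dominate it} by iterating to a fixed point."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    # Start from "everything dominates everything" and shrink.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in pred[n]]
            new = {n} | (set.intersection(*incoming) if incoming else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def reverse(succ):
    """Flip every edge so post-dominators can be computed as dominators."""
    rev = {n: [] for n in succ}
    for n, targets in succ.items():
        for t in targets:
            rev[t].append(n)
    return rev


def may_duplicate(succ, entry, exit_block, a, b):
    """True if block a dominates block b and block b post-dominates block a."""
    dom = dominators(succ, entry)
    pdom = dominators(reverse(succ), exit_block)
    return a in dom[b] and b in pdom[a]


# Hypothetical diamond-shaped CFG: entry branches to then/else, which rejoin.
cfg = {
    "entry": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": [],
}

print(may_duplicate(cfg, "entry", "join", "entry", "join"))  # True
print(may_duplicate(cfg, "entry", "join", "entry", "then"))  # False
```

On this diamond, duplicating from `entry` to `join` passes the check because every path through one goes through the other, while `entry` to `then` fails because `then` is only conditionally executed.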

2016 Oct 24

RFC: (Co-)Convergent functions and uniform function parameters

Hi all, Some brain-storming on an issue with SPMD/SIMT backend support where I think some additional IR attributes would be useful. Sorry for the somewhat long mail; the short version of my current thinking is that I would like to have the following: 1) convergent: a call to a function with this attribute cannot be moved to have additional control dependencies; i.e., moving it from A to B is

2016 Oct 24

RFC: (Co-)Convergent functions and uniform function parameters

On 24.10.2016 21:54, Mehdi Amini wrote: >> On Oct 24, 2016, at 12:38 PM, Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> Some brain-storming on an issue with SPMD/SIMT backend support where I think some additional IR attributes would be useful. Sorry for the somewhat long mail; the short version of my current thinking is that I would like to have the following:

2015 Sep 22

[RFC] Refinement of convergent semantics

Hi Jingyue, I consider it a very important element of the design of convergent that it does not require baseline LLVM to contain a definition of uniformity, which would itself pull in a definition of SIMT/SPMD, warps, threads, etc. The intention is that it should be a conservative (but hopefully not too conservative) approximation, and that implementations of specific GPU programming models

2016 Oct 24

RFC: (Co-)Convergent functions and uniform function parameters

> On Oct 24, 2016, at 4:15 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote: > > On 25.10.2016 01:11, Nicolai Hähnle wrote: >> On 24.10.2016 21:54, Mehdi Amini wrote: >>>> On Oct 24, 2016, at 12:38 PM, Nicolai Hähnle via llvm-dev >>>> <llvm-dev at lists.llvm.org> wrote: >>>> Some brain-storming on an issue with SPMD/SIMT backend

2018 Dec 19

[RFC] Adding thread group semantics to LangRef (motivated by GPUs)

Hi all, LLVM needs a solution to the long-standing problem that the IR is unable to express certain semantics expected by high-level programming languages that target GPUs. Solving this issue is necessary both for upstream use of LLVM as a compiler backend for GPUs and for correctly supporting LLVM IR <-> SPIR-V roundtrip translation. It may also be useful for compilers targeting

2016 Oct 26

RFC: (Co-)Convergent functions and uniform function parameters

On 25.10.2016 16:28, Nicolai Hähnle wrote: > But I fear that this path leads to eternal fuzziness. Let me try a > completely different approach to define what we need by augmenting the > semantics of IR with "divergence tokens". In addition to its usual > value, every IR value carries a "divergence set" of divergence tokens. > > The basic rule is: the

2015 Sep 04

[RFC] Refinement of convergent semantics

Hi all, In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc. Credit to John McCall for talking this over with me and seeding the core ideas. Today,

2020 Jan 03

Writing loop transformations on the right representation is more productive

In the 2018 LLVM DevMtg [1], I presented some shortcomings of how LLVM optimizes loops. In summary, the biggest issues are (a) the complexity of writing a new loop optimization pass (including needing to deal with a variety of low-level issues, a significant amount of required boilerplate, the difficulty of analysis preservation, etc.), (b) independent optimization heuristics and a fixed pass

2011 Dec 14

[LLVMdev] Changes to the PTX calling conventions

Hi all, On 12/13/2011 10:50 PM, Justin Holewinski wrote: > You mean having no calling convention for device functions, and a new, common > calling convention for kernels? I think this might make sense. One major issue with OpenCL C (and I suppose CUDA) kernels that some fail to see is that the functions are "directly callable" (just by choosing the correct calling convention) in

2015 Sep 14

[RFC] Refinement of convergent semantics

> On Sep 14, 2015, at 12:15 PM, Philip Reames <listmail at philipreames.com> wrote: > > On 09/04/2015 01:25 PM, Owen Anderson via llvm-dev wrote: >> Hi all, >> >> In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that

2013 Jan 25

[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

On 01/25/2013 09:56 AM, Nadav Rotem wrote: > Thanks for checking the Loop Vectorizer, I am interested in hearing your > feedback. The Loop Vectorizer does not fit here. OpenCL vectorization is > completely different because the language itself is data-parallel. You > don't need all of the legality checks that the loop vectorizer has. I'm aware of this and it was my point in

2011 Dec 13

[LLVMdev] Changes to the PTX calling conventions

On Tue, Dec 13, 2011 at 3:37 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote: > From: Justin Holewinski [mailto:justin.holewinski at gmail.com] > Sent: Tuesday, December 13, 2011 10:50 AM > To: Villmow, Micah > Cc: LLVM Developers Mailing List > Subject: Re: [LLVMdev] Changes to the PTX calling conventions > > On

2011 Dec 13

[LLVMdev] Changes to the PTX calling conventions

From: Justin Holewinski [mailto:justin.holewinski at gmail.com] Sent: Tuesday, December 13, 2011 10:50 AM To: Villmow, Micah Cc: LLVM Developers Mailing List Subject: Re: [LLVMdev] Changes to the PTX calling conventions On Tue, Dec 13, 2011 at 12:54 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote: From: Justin Holewinski [mailto:justin.holewinski

2004 Jul 29

extracting the t-statistic: just the numbers, please

Hi, there I am quite sure there is an easy answer to this, but I am unsure how to gather a bunch of t-statistics in an organized format. I am trying to generate a list of t-statistics for a randomization routine. If I try to collect a bunch of t-statistics from a run, this is what happens: > M <- 10 ; simt <- NULL > for(i in 1:M) + { + perm<-sample(site,replace=F) + +

2013 Aug 20

[Bug 68344] New: [piglit] shaders/glsl-fs-texture2d-dependent-4 randomly passes or fails on NVAA/NV50

https://bugs.freedesktop.org/show_bug.cgi?id=68344 Priority: medium Bug ID: 68344 Assignee: nouveau at lists.freedesktop.org Summary: [piglit] shaders/glsl-fs-texture2d-dependent-4 randomly passes or fails on NVAA/NV50 QA Contact: xorg-team at lists.x.org Severity: normal Classification: Unclassified

2013 Jan 25

[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

Hi Pekka, > Hi, > > I started to play with the LoopVectorizer of LLVM trunk > on the work-item loops produced by pocl's OpenCL C > kernel compiler, in hopes of implementing multi-work-item > work group autovectorization in a modular manner. > Thanks for checking the Loop Vectorizer, I am interested in hearing your feedback. The Loop Vectorizer does not fit here.

2019 Jan 22

[RFC] Late (OpenMP) GPU code "SPMD-zation"

We would still know that. We can do exactly the same reasoning as we do now. I think the important question is, how different is the code generated for either mode and can we hide (most of) the differences in the runtime. If I understand you correctly, you say the data sharing code looks very different and the differences cannot be hidden, correct? It would be helpful for me to understand your
