Threads similar to: "[RFC] A New Divergence Analysis for LLVM"

Displaying 20 results from an estimated 2000 matches.

2019 Jul 22
3
Fwd: bugpoint can't automatically select a safe interpreter!
I tried to reduce the test case in https://bugs.llvm.org/show_bug.cgi?id=42706. Here it is crashing opt:

    $ ~/llvm-debug/bin/opt -use-gpu-divergence-analysis -divergence stripped.ll
    WARNING: You're attempting to print out a bitcode file. This is inadvisable as it may cause display problems. If you REALLY want to taste LLVM bitcode first-hand, you can force output with the `-f' option.
2017 Jul 14
2
[SPIR/PTX] Divergence analysis for BasicBlocks
Hello, It seems to me that our current DivergenceAnalysis does not save which BasicBlocks may suffer from divergent control. Am I correct? I want to modify our DivergenceAnalysis to add a "bool isControlDivergent(BasicBlock*) const" method and record in the divergence propagator the basic blocks that are divergent. I am not sure that is entirely correct; if you have input on that please
2017 Jul 21
2
[SPIR/PTX] Divergence analysis for BasicBlocks
Hello, Yes? Where is allActive defined? I couldn't find it. Basically, a BB is control divergent if its execution depends on a branch that itself depends on a divergent SSA value.

On Fri, Jul 21, 2017 at 4:13 PM, Zaks, Ayal <ayal.zaks at intel.com> wrote:
> What would be the definition of “isControlDivergent(BasicBlock*)”; the complementary of “allActive(BasicBlock*)” –
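A minimal sketch of what the proposed query could look like, assuming the propagator simply records divergent blocks in a set (the class and member names below are illustrative, not from an actual patch):

    #include "llvm/ADT/DenseSet.h"
    #include "llvm/IR/BasicBlock.h"

    // Illustrative sketch: the propagator records every BasicBlock whose
    // execution depends on a branch over a divergent SSA value.
    class DivergencePropagatorSketch {
      llvm::DenseSet<const llvm::BasicBlock *> DivergentBlocks;

    public:
      void markControlDivergent(const llvm::BasicBlock *BB) {
        DivergentBlocks.insert(BB);
      }

      // The proposed query: true iff this block's execution depends on a
      // branch that itself depends on a divergent value.
      bool isControlDivergent(const llvm::BasicBlock *BB) const {
        return DivergentBlocks.count(BB);
      }
    };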
2018 Jan 16
1
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
On 01/15/2018 03:52 PM, Philip Reames wrote:
> To revive the discussion around vectorizer testing, here's a quick sample of a few of the issues hit recently in the loop vectorizer. I want to be careful to say that I am not stating these are the result of any recent work, just that they're issues that have been triaged down to the loop vectorizer doing something
2018 Feb 05
1
[RFC] Upstreaming PACXX (Programming Accelerators with C++)
I was going to say, this reminds me of Kai's presentation at FOSDEM yesterday. https://fosdem.org/2018/schedule/event/heterogenousd/ It's always good to see the cross-architecture power of LLVM being used in creative ways! :) cheers, --renato

On 5 February 2018 at 13:35, Nicholas Wilson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Interesting.
>
> I do something
2018 Jan 15
0
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
To revive the discussion around vectorizer testing, here's a quick sample of a few of the issues hit recently in the loop vectorizer.  I want to be careful to say that I am not stating these are the result of any recent work, just that they're issues that have been triaged down to the loop vectorizer doing something incorrect or questionable from a performance perspective.
2017 Dec 06
3
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
Proposal for Outer Loop Vectorization Implementation Plan

Goal: Extending Loop Vectorizer (LV) such that it can handle outer loops, via VPlan infrastructure enhancements. Understand the trade-offs in trying to make concurrent progress with moving remaining inner loop vectorization functionality to VPlan infrastructure.
2018 Feb 05
0
[RFC] Upstreaming PACXX (Programming Accelerators with C++)
Interesting. I do something similar for D targeting CUDA (via NVPTX) and OpenCL (via my forward-ported fork of Khronos’ SPIRV-LLVM)[1], except all the code generation is done at compile time. The runtime is aided by compile-time reflection so that calling kernels is done by symbol. What kind of performance difference do you see running code that was not developed with GPU in mind (e.g.
2017 Dec 06
5
[LV][VPlan] Status Update on VPlan ----- where we are currently, and what's ahead of us
Status Update on VPlan ---- where we are currently, and what's ahead of us

Goal: Extending Loop Vectorizer (LV) such that it can handle outer loops, via uplifting its infrastructure with VPlan. The goal of this status update is to summarize the progress and the future steps needed.

Background: This is related to
2017 Dec 14
3
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
> Another might be to introduce changes under feature flags to ease the revert/reintroduce/revert cycle.

This is essentially the first guard. We plan to have flags/settings to control which types of outer loops are handled. The new code path is initially exclusive to outer loop vectorization. If we disable all types of outer loops (and that's the initial default), LV continues to be good
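A hedged sketch of the kind of flag guard being described; the option name and helper below are assumptions for illustration, not the settings that actually landed:

    #include "llvm/Support/CommandLine.h"

    // Hypothetical flag: outer-loop handling is off by default, so the
    // existing inner-loop path behaves exactly as before unless a user
    // explicitly opts in.
    static llvm::cl::opt<bool> EnableOuterLoopVec(
        "enable-outer-loop-vectorization", llvm::cl::init(false),
        llvm::cl::Hidden,
        llvm::cl::desc("Enable the VPlan code path for outer loops"));

    // Hypothetical helper: only outer loops behind the flag take the new
    // code path; everything else keeps the existing one.
    static bool useOuterLoopPath(bool IsOuterLoop) {
      return IsOuterLoop && EnableOuterLoopVec;
    }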
2018 Feb 05
4
[RFC] Upstreaming PACXX (Programming Accelerators with C++)
Hi LLVM community, after 3 years of development and various talks at LLVM-HPC, EuroLLVM, and other scientific conferences, I want to present my PhD research topic to the lists. The main goal of my research was to develop a single-source programming model, comparable to CUDA or SYCL, for accelerators supported by LLVM (e.g., Nvidia GPUs). PACXX uses Clang as front-end for code generation and comes with
2016 Oct 31
0
RFC: (Co-)Convergent functions and uniform function parameters
(I work on CUDA / PTX.) For one thing, I'm in favor of having fewer annotations rather than more, so if we can do this in a reasonable way without introducing the notion of co-convergent calls, I think that would be a win. The one convergent annotation is difficult enough for the GPU folks to grok and then keep in cache, and everyone who works on LLVM has to pay the cost of keeping their
2015 Jan 25
2
[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence
Hi Owen and Vinod, Thanks for sharing the paper! I like the idea a lot. Regarding the paper itself, Vinod, are the consensual branches (e.g., cbranch.ifnone) you mentioned in the paper publicly available in the PTX ISA? Owen, could you explain more on the approach of using branch-if-none instructions in your mind? I believe you have lots of great insights, but I don't see how cbranch.ifnone
2017 Dec 19
2
MemorySSA question
On Tue, Dec 19, 2017 at 9:10 AM, Siddharth Bhat via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I could be entirely wrong, but from my understanding of MemorySSA, each def defines an "abstract heap state" which has the coarsest possible definition - any write will be modelled as a "new heap state".

This is true for def-def relationships, but
2017 Dec 19
4
MemorySSA question
Hi, I am new to MemorySSA and wanted to understand its capabilities. Hence I wrote the following program (test.c):

    int N;
    void test(int *restrict a, int *restrict b, int *restrict c,
              int *restrict d, int *restrict e) {
      int i;
      for (i = 0; i < N; i = i + 5) {
        a[i] = b[i] + c[i];
      }
      for (i = 0; i < N - 5; i = i + 5) {
        e[i] = a[i] * d[i];
      }
    }

I compiled this program using
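For readers trying the same experiment, a hedged sketch of how one might query MemorySSA for the stores in a program like test.c; the surrounding pass wiring is omitted, but MemorySSA::getWalker() and getClobberingMemoryAccess() are the real entry points:

    #include "llvm/Analysis/MemorySSA.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    // For each store (a MemoryDef), ask the walker for the nearest access
    // that actually clobbers the same memory; with the restrict-qualified
    // arrays above, unrelated defs should be skipped.
    static void printStoreClobbers(Function &F, MemorySSA &MSSA) {
      MemorySSAWalker *Walker = MSSA.getWalker();
      for (BasicBlock &BB : F)
        for (Instruction &I : BB)
          if (isa<StoreInst>(I)) {
            MemoryAccess *Clobber = Walker->getClobberingMemoryAccess(&I);
            errs() << I << "\n  clobbered by: " << *Clobber << "\n";
          }
    }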
2016 Oct 26
3
RFC: (Co-)Convergent functions and uniform function parameters
On 25.10.2016 16:28, Nicolai Hähnle wrote:
> But I fear that this path leads to eternal fuzziness. Let me try a completely different approach to define what we need by augmenting the semantics of IR with "divergence tokens". In addition to its usual value, every IR value carries a "divergence set" of divergence tokens.
>
> The basic rule is: the
2015 Jan 24
2
[LLVMdev] Proposal: pragma for branch divergence
Hi, I am considering a language extension to Clang for optimizing GPU programs. This extension will allow the compiler to use different optimization strategies for divergent and non-divergent branches (to be explained below). We have observed significant performance gain by leveraging this proposed extension, so I want to discuss it here to see how the community likes/dislikes the idea. I will
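To make the terminology concrete, a small illustrative model (ours, not the proposal's): a branch is divergent when threads of the same warp can take different sides, and non-divergent when they all agree.

    // Plain C++ model of one warp; `lane` stands in for the thread index.
    constexpr int WarpSize = 32;

    void simulateWarp(int n, const float *in, float *out) {
      for (int lane = 0; lane < WarpSize; ++lane) {
        if (n > 1024) {             // non-divergent: independent of `lane`,
          out[lane] = 0.0f;         // every lane takes the same side
          continue;
        }
        if (lane % 2 == 0)          // divergent: lanes disagree, so real
          out[lane] = in[lane] * 2; // hardware must execute both sides of
        else                        // this branch under a mask
          out[lane] = in[lane] + 1;
      }
    }

The proposed pragma would let the compiler pick a different optimization strategy for each of the two kinds of branch.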
2015 Jan 24
2
[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence
In our experience, as Owen also suggests, a pragma or a language extension can be avoided by a combination of static and dynamic analysis. We prefer this approach in our compiler ;) Regards, Vinod

On Sat, Jan 24, 2015 at 12:09 AM, Owen Anderson <resistor at mac.com> wrote:
> Hi Jingyue,
>
> Have you considered using dynamic uniformity checks? In my experience you can
2020 Jan 15
3
[RFC] Writing loop transformations on the right representation is more productive
On Sun, Jan 12, 2020 at 20:07, Chris Lattner <clattner at nondot.org> wrote:
> The central idea is to use a modifiable loop tree -- similar to LoopInfo -- as the primary representation. LLVM-IR is converted to a loop tree, then optimized and finally LLVM-IR is generated again for subtrees that are considered profitable. This is not a new concept, it has already
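A minimal sketch (our guess at the shape, not the RFC's actual data structure) of such a loop tree: each node mirrors one loop of the nest, and LLVM-IR is regenerated only for subtrees marked profitable.

    #include <memory>
    #include <vector>

    // Hypothetical loop-tree node, analogous to LoopInfo's nesting.
    struct LoopTreeNode {
      std::vector<std::unique_ptr<LoopTreeNode>> Children; // nested loops
      bool Profitable = false; // regenerate LLVM-IR for this subtree?
    };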
2017 Dec 05
2
[AMDGPU] Strange results with different address spaces
> On Dec 5, 2017, at 13:53, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
>> On Dec 5, 2017, at 02:51, Haidl, Michael via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi dev list,
>>
>> I am currently exploring the integration of AMDGPU/ROCm into the PACXX project and observing some