Threads similar to: "[RFC] A New Divergence Analysis for LLVM"

Displaying 20 results from an estimated 2000 matches.

2019 Jul 22
3
Fwd: bugpoint can't automatically select a safe interpreter!
I tried to reduce the test case in https://bugs.llvm.org/show_bug.cgi?id=42706. Here it is crashing opt:

    $ ~/llvm-debug/bin/opt -use-gpu-divergence-analysis -divergence stripped.ll
    WARNING: You're attempting to print out a bitcode file. This is inadvisable as it may cause display problems. If you REALLY want to taste LLVM bitcode first-hand, you can force output with the `-f' option.
2017 Jul 14
2
[SPIR/PTX] Divergence analysis for BasicBlocks
Hello, It seems to me that our current DivergenceAnalysis does not save which BasicBlocks may suffer from divergent control. Am I correct? I want to modify our DivergenceAnalysis to add a "bool isControlDivergent(BasicBlock*) const" method and record in the divergence propagator the basic blocks that are divergent. I am not sure that is entirely correct; if you have input on that please
2017 Jul 21
2
[SPIR/PTX] Divergence analysis for BasicBlocks
Hello, Yes? Where is allActive defined? I couldn't find it. Basically, a BB is control divergent if its execution depends on a branch that itself depends on a divergent SSA value.

On Fri, Jul 21, 2017 at 4:13 PM, Zaks, Ayal <ayal.zaks at intel.com> wrote:
> What would be the definition of “isControlDivergent(BasicBlock*)”; the complementary of “allActive(BasicBlock*)” –
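A minimal sketch of what the proposed query could look like, assuming the propagator simply records divergent blocks in a set (the class and member names below are illustrative, not from an actual patch):

    #include "llvm/ADT/DenseSet.h"
    #include "llvm/IR/BasicBlock.h"

    // Illustrative sketch: the propagator records every BasicBlock whose
    // execution depends on a branch over a divergent SSA value.
    class DivergencePropagatorSketch {
      llvm::DenseSet<const llvm::BasicBlock *> DivergentBlocks;

    public:
      void markControlDivergent(const llvm::BasicBlock *BB) {
        DivergentBlocks.insert(BB);
      }

      // The proposed query: true iff this block's execution depends on a
      // branch that itself depends on a divergent value.
      bool isControlDivergent(const llvm::BasicBlock *BB) const {
        return DivergentBlocks.count(BB);
      }
    };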
2018 Jan 16
1
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
On 01/15/2018 03:52 PM, Philip Reames wrote:
> To revive the discussion around vectorizer testing, here's a quick sample of a few of the issues hit recently in the loop vectorizer. I want to be careful to say that I am not stating these are the result of any recent work, just that they're issues that have been triaged down to the loop vectorizer doing something
2018 Feb 05
1
[RFC] Upstreaming PACXX (Programming Accelerators with C++)
I was going to say, this reminds me of Kai's presentation at FOSDEM yesterday. https://fosdem.org/2018/schedule/event/heterogenousd/ It's always good to see the cross-architecture power of LLVM being used in creative ways! :) cheers, --renato

On 5 February 2018 at 13:35, Nicholas Wilson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Interesting.
>
> I do something
2018 Jan 15
0
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
To revive the discussion around vectorizer testing, here's a quick sample of a few of the issues hit recently in the loop vectorizer.  I want to be careful to say that I am not stating these are the result of any recent work, just that they're issues that have been triaged down to the loop vectorizer doing something incorrect or questionable from a performance perspective.
2017 Dec 06
3
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
Proposal for Outer Loop Vectorization Implementation Plan

Goal: Extending Loop Vectorizer (LV) such that it can handle outer loops, via VPlan infrastructure enhancements. Understand the trade-offs in trying to make concurrent progress with moving remaining inner loop vectorization functionality to VPlan infrastructure.
2018 Feb 05
0
[RFC] Upstreaming PACXX (Programming Accelerators with C++)
Interesting. I do something similar for D targeting CUDA (via NVPTX) and OpenCL (via my forward-ported fork of Khronos’ SPIRV-LLVM)[1], except all the code generation is done at compile time. The runtime is aided by compile-time reflection so that calling kernels is done by symbol. What kind of performance difference do you see running code that was not developed with GPU in mind (e.g.
2017 Dec 06
5
[LV][VPlan] Status Update on VPlan ----- where we are currently, and what's ahead of us
Status Update on VPlan ---- where we are currently, and what's ahead of us

Goal: Extending Loop Vectorizer (LV) such that it can handle outer loops, via uplifting its infrastructure with VPlan. The goal of this status update is to summarize the progress and the future steps needed.

Background: This is related to
2017 Dec 14
3
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
> Another might be to introduce changes under feature flags to ease the revert/reintroduce/revert cycle.

This is essentially the first guard. We plan to have flags/settings to control which types of outer loops are handled. The new code path is initially exclusive to outer loop vectorization. If we disable all types of outer loops (and that's the initial default), LV continues to be good
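A hedged sketch of the kind of flag guard being described; the option name and helper below are assumptions for illustration, not the settings that actually landed:

    #include "llvm/Support/CommandLine.h"

    // Hypothetical flag: outer-loop handling is off by default, so the
    // existing inner-loop path behaves exactly as before unless a user
    // explicitly opts in.
    static llvm::cl::opt<bool> EnableOuterLoopVec(
        "enable-outer-loop-vectorization", llvm::cl::init(false),
        llvm::cl::Hidden,
        llvm::cl::desc("Enable the VPlan code path for outer loops"));

    // Hypothetical helper: only outer loops behind the flag take the new
    // code path; everything else keeps the existing one.
    static bool useOuterLoopPath(bool IsOuterLoop) {
      return IsOuterLoop && EnableOuterLoopVec;
    }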
2018 Feb 05
4
[RFC] Upstreaming PACXX (Programming Accelerators with C++)
Hi LLVM community, after 3 years of development and various talks at LLVM-HPC, EuroLLVM, and other scientific conferences, I want to present my PhD research topic to the lists. The main goal of my research was to develop a single-source programming model, comparable to CUDA or SYCL, for accelerators supported by LLVM (e.g., Nvidia GPUs). PACXX uses Clang as front-end for code generation and comes with
2016 Oct 31
0
RFC: (Co-)Convergent functions and uniform function parameters
(I work on CUDA / PTX.) For one thing, I'm in favor of having fewer annotations rather than more, so if we can do this in a reasonable way without introducing the notion of co-convergent calls, I think that would be a win. The one convergent annotation is difficult enough for the GPU folks to grok and then keep in cache, and everyone who works on LLVM has to pay the cost of keeping their
2015 Jan 25
2
[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence
Hi Owen and Vinod, Thanks for sharing the paper! I like the idea a lot. Regarding the paper itself, Vinod, are the consensual branches (e.g., cbranch.ifnone) you mentioned in the paper publicly available in the PTX ISA? Owen, could you explain more on the approach of using branch-if-none instructions in your mind? I believe you have lots of great insights, but I don't see how cbranch.ifnone
2017 Dec 19
2
MemorySSA question
On Tue, Dec 19, 2017 at 9:10 AM, Siddharth Bhat via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I could be entirely wrong, but from my understanding of MemorySSA, each def defines an "abstract heap state" which has the coarsest possible definition - any write will be modelled as a "new heap state".

This is true for def-def relationships, but
2017 Dec 19
4
MemorySSA question
Hi, I am new to MemorySSA and wanted to understand its capabilities. Hence I wrote the following program (test.c):

    int N;
    void test(int *restrict a, int *restrict b, int *restrict c,
              int *restrict d, int *restrict e) {
      int i;
      for (i = 0; i < N; i = i + 5) {
        a[i] = b[i] + c[i];
      }
      for (i = 0; i < N - 5; i = i + 5) {
        e[i] = a[i] * d[i];
      }
    }

I compiled this program using
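For readers trying the same experiment, a hedged sketch of how one might query MemorySSA for the stores in a program like test.c; the surrounding pass wiring is omitted, but MemorySSA::getWalker() and getClobberingMemoryAccess() are the real entry points:

    #include "llvm/Analysis/MemorySSA.h"
    #include "llvm/IR/Function.h"
    #include "llvm/IR/Instructions.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    // For each store (a MemoryDef), ask the walker for the nearest access
    // that actually clobbers the same memory; with the restrict-qualified
    // arrays above, unrelated defs should be skipped.
    static void printStoreClobbers(Function &F, MemorySSA &MSSA) {
      MemorySSAWalker *Walker = MSSA.getWalker();
      for (BasicBlock &BB : F)
        for (Instruction &I : BB)
          if (isa<StoreInst>(I)) {
            MemoryAccess *Clobber = Walker->getClobberingMemoryAccess(&I);
            errs() << I << "\n  clobbered by: " << *Clobber << "\n";
          }
    }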
2016 Oct 26
3
RFC: (Co-)Convergent functions and uniform function parameters
On 25.10.2016 16:28, Nicolai Hähnle wrote:
> But I fear that this path leads to eternal fuzziness. Let me try a completely different approach to define what we need by augmenting the semantics of IR with "divergence tokens". In addition to its usual value, every IR value carries a "divergence set" of divergence tokens.
>
> The basic rule is: the
2015 Jan 24
2
[LLVMdev] Proposal: pragma for branch divergence
Hi, I am considering a language extension to Clang for optimizing GPU programs. This extension will allow the compiler to use different optimization strategies for divergent and non-divergent branches (to be explained below). We have observed significant performance gain by leveraging this proposed extension, so I want to discuss it here to see how the community likes/dislikes the idea. I will
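To make the terminology concrete, a small illustrative model (ours, not the proposal's): a branch is divergent when threads of the same warp can take different sides, and non-divergent when they all agree.

    // Plain C++ model of one warp; `lane` stands in for the thread index.
    constexpr int WarpSize = 32;

    void simulateWarp(int n, const float *in, float *out) {
      for (int lane = 0; lane < WarpSize; ++lane) {
        if (n > 1024) {             // non-divergent: independent of `lane`,
          out[lane] = 0.0f;         // every lane takes the same side
          continue;
        }
        if (lane % 2 == 0)          // divergent: lanes disagree, so real
          out[lane] = in[lane] * 2; // hardware must execute both sides of
        else                        // this branch under a mask
          out[lane] = in[lane] + 1;
      }
    }

The proposed pragma would let the compiler pick a different optimization strategy for each of the two kinds of branch.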
2015 Jan 24
2
[LLVMdev] [cfe-dev] Proposal: pragma for branch divergence
In our experience, as Owen also suggests, a pragma or a language extension can be avoided by a combination of static and dynamic analysis. We prefer this approach in our compiler ;) Regards, Vinod

On Sat, Jan 24, 2015 at 12:09 AM, Owen Anderson <resistor at mac.com> wrote:
> Hi Jingyue,
>
> Have you considered using dynamic uniformity checks? In my experience you can
2020 Jan 15
3
[RFC] Writing loop transformations on the right representation is more productive
On Sun, Jan 12, 2020 at 20:07, Chris Lattner <clattner at nondot.org> wrote:
> The central idea is to use a modifiable loop tree -- similar to LoopInfo -- as the primary representation. LLVM-IR is converted to a loop tree, then optimized and finally LLVM-IR is generated again for subtrees that are considered profitable. This is not a new concept, it has already
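A minimal sketch (our guess at the shape, not the RFC's actual data structure) of such a loop tree: each node mirrors one loop of the nest, and LLVM-IR is regenerated only for subtrees marked profitable.

    #include <memory>
    #include <vector>

    // Hypothetical loop-tree node, analogous to LoopInfo's nesting.
    struct LoopTreeNode {
      std::vector<std::unique_ptr<LoopTreeNode>> Children; // nested loops
      bool Profitable = false; // regenerate LLVM-IR for this subtree?
    };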
2017 Dec 05
2
[AMDGPU] Strange results with different address spaces
> On Dec 5, 2017, at 13:53, Matt Arsenault <arsenm2 at gmail.com> wrote:
>
>> On Dec 5, 2017, at 02:51, Haidl, Michael via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi dev list,
>>
>> I am currently exploring the integration of AMDGPU/ROCm into the PACXX project and observing some