thr3ads.net - similar to: "[LLVMdev] GSoC Proposal: Profiling Enhancements"

[LLVMdev] RFC: Profiling Enhancements (GSoC)

2012 Jul 16

Hi all, In light of the expected removal of ProfileInfo this is a request for comments on the next few items that I now plan to work on for GSoC. Planned tasks: #0 Add support for determining branch weight metadata by profiling At the absolute minimum this will require writing a new profile loader which will set branch weight metadata based on profiling data. #1 Optionally use profiling

[RFC] Refinement of convergent semantics

2015 Sep 22

Hi Jingyue, I consider it a very important element of the design of convergent that it does not require baseline LLVM to contain a definition of uniformity, which would itself pull in a definition of SIMT/SPMD, warps, threads, etc. The intention is that it should be a conservative (but hopefully not too conservative) approximation, and that implementations of specific GPU programming models

[RFC] Refinement of convergent semantics

2015 Sep 04

Hi all, In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc. Credit to John McCall for talking this over with me and seeding the core ideas. Today,

[RFC] Refinement of convergent semantics

2015 Sep 14

> On Sep 14, 2015, at 12:15 PM, Philip Reames <listmail at philipreames.com> wrote: > > On 09/04/2015 01:25 PM, Owen Anderson via llvm-dev wrote: >> Hi all, >> >> In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that

[LLVMdev] RFC: Profiling Enhancements (GSoC)

2012 Jul 17

Hi Alastair, In addition to your planned tasks, you might want to put in some work to ensure branch probabilities are not lost during optimization. One known issue is that the LLVM optimizer can turn branchy code into switch statements and completely discard the probabilities. Here is a simple example: static void func2(int N, const int *a, const int *b, int *c) __attribute__((always_inline)); void

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

2013 Jul 30

On 7/29/2013 6:28 PM, Andrew Trick wrote: > > You mean that LICM and Unswitching should be left for later? For the purpose of exposing scalar optimizations, I'm not sure I agree with that but I'd be interested in examples. Optimizations like LICM and unswitching can potentially damage the perfect nesting of loops. For example, consider this nest: for (i) { for (j) {

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

2013 Jul 29

On Jul 29, 2013, at 9:05 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote: > On 7/16/2013 11:38 PM, Andrew Trick wrote: >> Since introducing the new TargetTransformInfo analysis, there has been some confusion over the role of target heuristics in IR passes. A few patches have led to interesting discussions. >> >> To centralize the discussion, until we get

Query on unswitching + vectorization

2018 May 11

Hi, I am going through analysis on unswitching + vectorization. For the below test, LLVM unswitches successfully but fails to vectorize the loop after unswitching. LLVM bails out saying "Found an outside user", which apparently is the value of 'tmp'. int i, w, x[1000], y[1000],tmp; void fn() { for (i = 0; i < 1000; i++) { if (w==1) { y[i] = 1; tmp = i*2; }

Query on unswitching + vectorization

2018 May 11

On 5/10/2018 10:44 PM, Gopalasubramanian, Ganesh via llvm-dev wrote: > > Hi, > > I am going through analysis on unswitching + vectorization. > > For the below test, llvm unswitches successfully but fails to > vectorize the loop after unswitching. > > Llvm bails out saying “Found an outside user” apparently which is the > value of ‘tmp’. > > int i, w, x[1000],

RFC: Extending optimization reporting

2019 May 08

Hi Adam, Thanks for your input. If I understand correctly, you’re saying that we can handle the loop versioning issue by explicitly identifying new loops as they are created. So, the unswitching optimization, for example, would report that it unswitched loop-0 at source location X, creating loop-1 and loop-2, and then later the vectorizer would report that it was unable to vectorize loop-1 at

How to best deal with undesirable Induction Variable Simplification?

2019 Aug 08

Hello, Recently I've come across two instances where Induction Variable Simplification led to noticeable performance regressions. In one case, the removal of an extra IV led to the inability to reschedule instructions in a tight loop to reduce stalls. In that case, there were enough registers to spare, so using an extra register for the extra induction variable was preferable since it reduced

A bug related with undef value when bootstrap MemorySSA.cpp

2017 Jul 17

The issue blocks another optimization patch and Wei has spent a huge amount of effort isolating the bootstrap failure to this same problem. I agree with Wei that other developers may also get hit by the same issue and the cost of leaving this issue open for long can be very high to the community. David On Mon, Jul 17, 2017 at 10:01 AM, Wei Mi <wmi at google.com> wrote: > Sanjoy and

A bug related with undef value when bootstrap MemorySSA.cpp

2017 Jul 17

Cool, thanks for debugging this issue and letting us know. We have a few patches to fix this issue: - Introduce freeze in IR: https://reviews.llvm.org/D29011 - Lowering freeze: https://reviews.llvm.org/D29014 - Fix loop unswitch: https://reviews.llvm.org/D29015 Bonus patches to recover perf: - Be less conservative in loop unswitching: https://reviews.llvm.org/D29016 - Instcombine support

[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable

2015 Jul 15

Hi all, I would like to propose an improvement of the “almost dead” block elimination in Transforms/Local.cpp so that it will preserve the canonical loop form for loops with a volatile iteration variable. *** Problem statement Nested loops in LCALS Subset B (https://codesign.llnl.gov/LCALS.php) are not vectorized with LLVM -O3 because the LLVM loop vectorizer fails the test whether the loop

[LLVMdev] Instructions that cannot be duplicated

2009 Oct 09

Is inlining (which duplicates code) of functions containing OpenCL-style barriers legal? Or e.g. if you had some changed phase ordering where you had if (cond) { S1; } call user_func() // user_func has a barrier buried inside it. you do tail splitting if (cond) { S1; call user_func() } else { call user_func(); } now you inline -- oops now you might have a problem so do you want

How to best deal with undesirable Induction Variable Simplification?

2019 Aug 09

Hi Hal, I see. So LSR could theoretically counteract undesirable Ind Var transformations but it's not implemented at the moment? I think I've managed to come up with a small reproducer that can also exhibit a similar problem on x86, here it is: https://godbolt.org/z/_wxzut As you can see, when rewriteLoopExitValues is not disabled Clang generates worse code due to additional spills,

FYI, planning to enable nontrivial loop unswitch in the new PM at O3

2018 Apr 29

Is there any written description of what "non trivialness" is there? On Sun, Apr 29, 2018, 2:49 PM Chandler Carruth via llvm-dev < llvm-dev at lists.llvm.org> wrote: > One of the last big missing pieces for the new PM is enabling non-trivial > loop unswitch at O3. > > The pass is now working well and passing all the testing I have done as > well as some others'

GSoC Proposal : Path Profiling Support

2016 Mar 15

This proposal adds support for path profiling [Ball96] to LLVM. Path profiling compactly represents acyclic paths in a directed acyclic graph representation of the control flow graph of a routine. Instrumentation can be added to uniquely identify paths executed at runtime. Path profiles enable precise enumeration of the sequence of basic blocks executed in order for a particular path. Using path

[LLVMdev] Autotuning parameters/heuristics within LLVM

2014 Oct 02

Hi, I am planning to begin a project to explore the space of tuning LLVM internals in an effort to increase performance. I am wondering if anyone can point me to any parameterizations, heuristics, or priority functions within LLVM that can be tuned/adjusted. So far, I'm considering BranchProbabilityInfo and InlineCost. Does anyone have any other suggestions? Thanks, Robert

[LLVMdev] IR Passes and TargetTransformInfo: Straw Man

2013 Jul 29

On 7/16/2013 11:38 PM, Andrew Trick wrote: > Since introducing the new TargetTransformInfo analysis, there has been some confusion over the role of target heuristics in IR passes. A few patches have led to interesting discussions. > > To centralize the discussion, until we get some documentation and better APIs in place, let me throw out an oversimplified Straw Man for a new pass pipeline.
