similar to: [LLVMdev] GSoC Proposal: Profiling Enhancements

Displaying 20 results from an estimated 20000 matches similar to: "[LLVMdev] GSoC Proposal: Profiling Enhancements"

2012 Jul 16
2
[LLVMdev] RFC: Profiling Enhancements (GSoC)
Hi all, In light of the expected removal of ProfileInfo this is a request for comments on the next few items that I now plan to work on for GSoC. Planned tasks: #0 Add support for determining branch weight metadata by profiling At the absolute minimum this will require writing a new profile loader which will set branch weight metadata based on profiling data. #1 Optionally use profiling
2015 Sep 22
2
[RFC] Refinement of convergent semantics
Hi Jingyue, I consider it a very important element of the design of convergent that it does not require baseline LLVM to contain a definition of uniformity, which would itself pull in a definition of SIMT/SPMD, warps, threads, etc. The intention is that it should be a conservative (but hopefully not too conservative) approximation, and that implementations of specific GPU programming models
2015 Sep 04
9
[RFC] Refinement of convergent semantics
Hi all, In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc. Credit to John McCall for talking this over with me and seeding the core ideas. Today,
2015 Sep 14
2
[RFC] Refinement of convergent semantics
> On Sep 14, 2015, at 12:15 PM, Philip Reames <listmail at philipreames.com> wrote: > > On 09/04/2015 01:25 PM, Owen Anderson via llvm-dev wrote: >> Hi all, >> >> In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that
2012 Jul 17
0
[LLVMdev] RFC: Profiling Enhancements (GSoC)
Hi Alastair, In addition to your planned tasks, you might want to put in some work to ensure branch probabilities are not lost during optimization. One known issue is LLVM optimizer can turn branchy code into switch statements and it would completely discard probability. Here is a simple example: static void func2(int N, const int *a, const int *b, int *c) __attribute__((always_inline)); void
2013 Jul 29
3
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
On Jul 29, 2013, at 9:05 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote: > On 7/16/2013 11:38 PM, Andrew Trick wrote: >> Since introducing the new TargetTransformInfo analysis, there has been some confusion over the role of target heuristics in IR passes. A few patches have led to interesting discussions. >> >> To centralize the discussion, until we get
2013 Jul 30
0
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
On 7/29/2013 6:28 PM, Andrew Trick wrote: > > You mean that LICM and Unswitching should be left for later? For the purpose of exposing scalar optimizations, I'm not sure I agree with that but I'd be interested in examples. Optimizations like LICM, and unswitching can potentially damage perfect nesting of loops. For example, consider this nest: for (i) { for (j) {
2018 May 11
2
Query on unswitching + vectorization
Hi, I am going through analysis on unswitching + vectorization. For the below test, llvm unswitches successfully but fails to vectorize the loop after unswitching. Llvm bails out saying "Found an outside user" apparently which is the value of 'tmp'. int i, w, x[1000], y[1000],tmp; void fn() { for (i = 0; i < 1000; i++) { if (w==1) { y[i] = 1; tmp = i*2; }
2018 May 11
0
Query on unswitching + vectorization
On 5/10/2018 10:44 PM, Gopalasubramanian, Ganesh via llvm-dev wrote: > > Hi, > > I am going through analysis on unswitching + vectorization. > > For the below test, llvm unswitches successfully but fails to > vectorize the loop after unswitching. > > Llvm bails out saying “Found an outside user” apparently which is the > value of ‘tmp’. > > int i, w, x[1000],
2019 May 08
2
RFC: Extending optimization reporting
Hi Adam, Thanks for your input. If I understand correctly, you’re saying that we can handle the loop versioning issue by explicitly identifying new loops as they are created. So, the unswitching optimization, for example, would report that it unswitched loop-0 at source location X, creating loop-1 and loop-2, and then later the vectorizer would report that it was unable to vectorize loop-1 at
2019 Aug 08
3
How to best deal with undesirable Induction Variable Simplification?
Hello, Recently I've come across two instances where Induction Variable Simplification lead to noticable performance regressions. In one case, the removal of extra IV lead to the inability to reschedule instructions in a tight loop to reduce stalls. In that case, there were enough registers to spare, so using extra register for extra induction variable was preferable since it reduced
2014 Oct 02
2
[LLVMdev] Autotuning parameters/heuristics within LLVM
Hi, I am planning to begin a project to explore the space of tuning LLVM internals in an effort to increase performance. I am wondering if anyone can point to me any parameterizations, heuristics, or priorities functions within LLVM that can be tuned/adjusted. So far, I'm considering BranchProbabilityInfo and InlineCost. Does anyone have any other suggestions? Thanks, Robert
2017 Jul 17
2
A bug related with undef value when bootstrap MemorySSA.cpp
The issue blocks another optimization patch and Wei has spent huge amount of effort isolating the the bootstrap failure to this same problem. I agree with Wei that other developers may also get hit by the same issue and the cost of leaving this issue open for long can be very high to the community. David On Mon, Jul 17, 2017 at 10:01 AM, Wei Mi <wmi at google.com> wrote: > Sanjoy and
2017 Jul 17
2
A bug related with undef value when bootstrap MemorySSA.cpp
Cool, thanks for debugging this issue and letting us know. We have a few patches to fix this issue: - Introduce freeze in IR: https://reviews.llvm.org/D29011 - Lowering freeze: https://reviews.llvm.org/D29014 - Fix loop unswitch: https://reviews.llvm.org/D29015 Bonus patches to recover perf: - Be less conservative in loop unswitching: https://reviews.llvm.org/D29016 - Instcombine support
2013 Jul 29
0
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
On 7/16/2013 11:38 PM, Andrew Trick wrote: > Since introducing the new TargetTransformInfo analysis, there has been some confusion over the role of target heuristics in IR passes. A few patches have led to interesting discussions. > > To centralize the discussion, until we get some documentation and better APIs in place, let me throw out an oversimplified Straw Man for a new pass pipline.
2009 Mar 31
2
[LLVMdev] Static Profiling - GSoC 2009
Hello all, I would like to participate in this year's Google Summer of Code and I am sending you a short description of my proposal. I have written the formal proposal already and if someone is interested I can send him the pdf. One of the open projects in the LLVM list is to enhance LLVM with static profiling capabilities. LLVM already provides a unified structure for writing pro
2015 Jul 15
5
[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable
Hi all, I would like to propose an improvement of the “almost dead” block elimination in Transforms/Local.cpp so that it will preserve the canonical loop form for loops with a volatile iteration variable. *** Problem statement Nested loops in LCALS Subset B (https://codesign.llnl.gov/LCALS.php) are not vectorized with LLVM -O3 because the LLVM loop vectorizer fails the test whether the loop
2009 Oct 09
3
[LLVMdev] Instructions that cannot be duplicated
Is inlining (which duplicates code) of functions containing OpenCL style barriers legal?or e.g. if you had some changed phase ordering where you had if (cond) { S1; } call user_func() // user_func has a barrier buried inside it. you do tail splitting if (cond) { S1; call user_func() } else { call user_func(); } now you inline -- oops now you might have a problem so do you want
2019 Aug 09
4
How to best deal with undesirable Induction Variable Simplification?
Hi Hal, I see. So LSR could theoretically counteract undesirable Ind Var transformations but it's not implemented at the moment? I think I've managed to come up with a small reproducer that can also exhibit similar problem on x86, here it is: https://godbolt.org/z/_wxzut As you can see, when rewriteLoopExitValues is not disabled Clang generates worse code due to additional spills,
2013 Jul 17
5
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
Since introducing the new TargetTransformInfo analysis, there has been some confusion over the role of target heuristics in IR passes. A few patches have led to interesting discussions. To centralize the discussion, until we get some documentation and better APIs in place, let me throw out an oversimplified Straw Man for a new pass pipline. It serves two purposes: (1) an overdue reorganization of