Displaying 20 results from an estimated 2000 matches similar to: "Query on unswitching + vectorization"
2018 May 11
0
Query on unswitching + vectorization
On 5/10/2018 10:44 PM, Gopalasubramanian, Ganesh via llvm-dev wrote:
>
> Hi,
>
> I am going through analysis on unswitching + vectorization.
>
> For the test below, LLVM unswitches successfully but fails to
> vectorize the loop after unswitching.
>
> LLVM bails out saying “Found an outside user”, which is apparently the
> value of ‘tmp’.
>
> int i, w, x[1000],
2018 May 14
1
Query on unswitching + vectorization
* Looks like some sort of pass ordering issue; it will vectorize if indvars runs sometime between loop unswitch and the vectorizer.
That insight is helpful. I scheduled induction variable canonicalization (indvars) before loop vectorization and got the loop vectorized.
The indvars pass is heavily dependent on SCEV. If there is a scalar like tmp that is of real type, we may not be able to get the
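A rough illustration of the situation discussed in this thread, written by hand and not taken from the poster's test case (which is cut off above). The function names, the loop body, and the way `tmp` is computed are all assumptions; the only points carried over are that the branch condition `w` is loop-invariant and that `tmp` is used after the loop, which is the kind of value the vectorizer reports as an outside user.

```c
#include <stddef.h>

/* Hypothetical loop: `w` does not change inside the loop, and `tmp`
 * is read after the loop ends (a use outside the loop). */
int original_loop(const int *x, size_t n, int w) {
    int tmp = 0;
    for (size_t i = 0; i < n; i++) {
        if (w)
            tmp = x[i] + 1;
        else
            tmp = x[i] - 1;
    }
    return tmp; /* use of `tmp` outside the loop */
}

/* The same computation unswitched by hand: the invariant test is
 * hoisted out and each branch gets its own copy of the loop. */
int unswitched_loop(const int *x, size_t n, int w) {
    int tmp = 0;
    if (w) {
        for (size_t i = 0; i < n; i++)
            tmp = x[i] + 1;
    } else {
        for (size_t i = 0; i < n; i++)
            tmp = x[i] - 1;
    }
    return tmp; /* still live after whichever loop ran */
}
```

Both functions return the same value for any input; the sketch only shows the shape of the code after unswitching, not what any particular LLVM version emits.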
2018 Apr 29
0
FYI, planning to enable nontrivial loop unswitch in the new PM at O3
Is there a written description of what "non-trivialness" means here?
On Sun, Apr 29, 2018, 2:49 PM Chandler Carruth via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> One of the last big missing pieces for the new PM is enabling non-trivial
> loop unswitch at O3.
>
> The pass is now working well and passing all the testing I have done as
> well as some others'
2018 Apr 29
2
FYI, planning to enable nontrivial loop unswitch in the new PM at O3
One of the last big missing pieces for the new PM is enabling non-trivial
loop unswitch at O3.
The pass is now working well and passing all the testing I have done as
well as some others' testing (thanks Fedor!) so it should be ready to be
enabled.
I've done preliminary benchmarking on the test suite and SPEC and haven't
seen any interesting regressions and quite a few improvements.
2017 Jul 17
2
A bug related with undef value when bootstrap MemorySSA.cpp
Cool, thanks for debugging this issue and letting us know.
We have a few patches to fix this issue:
- Introduce freeze in IR: https://reviews.llvm.org/D29011
- Lowering freeze: https://reviews.llvm.org/D29014
- Fix loop unswitch: https://reviews.llvm.org/D29015
Bonus patches to recover perf:
- Be less conservative in loop unswitching: https://reviews.llvm.org/D29016
- Instcombine support
2017 Jul 17
2
A bug related with undef value when bootstrap MemorySSA.cpp
The issue blocks another optimization patch and Wei has spent a huge amount
of effort isolating the bootstrap failure to this same problem. I agree
with Wei that other developers may also get hit by the same issue and the
cost of leaving this issue open for long can be very high to the community.
David
On Mon, Jul 17, 2017 at 10:01 AM, Wei Mi <wmi at google.com> wrote:
> Sanjoy and
2018 Apr 04
0
SCEV and LoopStrengthReduction Formulae
> cmpq %rbx, %r14
> jne .LBB0_1
>
> LLVM can perform compare-jump fusion, it already does in certain cases, but
> not in the case above. We can remove the cmp above if we were to perform
> the following transformation:
Do you mean branch-fusion (https://en.wikichip.org/wiki/macro-operation_fusion)?
Is there any further limitation that explains why these two are not fused?
> -----Original
2019 Aug 08
3
How to best deal with undesirable Induction Variable Simplification?
Hello,
Recently I've come across two instances where Induction Variable Simplification led to noticeable performance regressions.
In one case, the removal of an extra IV led to the inability to reschedule instructions in a tight loop to reduce stalls. In that case, there were enough registers to spare, so using an extra register for the extra induction variable was preferable since it reduced
2014 Apr 07
2
[LLVMdev] Loop unswitching creates dead code
Hi,
I'm surprised by the result of compiling the following lines of code:
    for (int i = 0; i < RANDOM_CHUNKS; i++) {
        for (int j = 0; j < RANDOM_CHUNK_SIZE; j++) {
            random_text[i][j] = (int)(ran()*256);
        }
    }
The problem happens when -fsanitize=undefined, -fno-sanitize-recover and
-O3 are enabled. In this case, UndefinedBehaviorSanitizer inserts a check for
array index out of
2015 Jul 15
5
[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable
Hi all,
I would like to propose an improvement of the “almost dead” block
elimination in Transforms/Local.cpp so that it will preserve the canonical
loop form for loops with a volatile iteration variable.
*** Problem statement
Nested loops in LCALS Subset B (https://codesign.llnl.gov/LCALS.php) are
not vectorized with LLVM -O3 because the LLVM loop vectorizer fails the
test whether the loop
2015 Sep 04
9
[RFC] Refinement of convergent semantics
Hi all,
In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that should resolve a lot of the identified problems regarding loop unrolling, loop unswitching, etc. Credit to John McCall for talking this over with me and seeding the core ideas.
Today,
2015 Jul 16
2
[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable
----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Hyojin Sung" <hsung at us.ibm.com>, llvmdev at cs.uiuc.edu
> Sent: Wednesday, July 15, 2015 7:34:54 PM
> Subject: Re: [LLVMdev] Improving loop vectorizer support for loops
> with a volatile iteration variable
> On Wed, Jul 15, 2015 at 12:55 PM Hyojin Sung
2015 Sep 14
2
[RFC] Refinement of convergent semantics
> On Sep 14, 2015, at 12:15 PM, Philip Reames <listmail at philipreames.com> wrote:
>
> On 09/04/2015 01:25 PM, Owen Anderson via llvm-dev wrote:
>> Hi all,
>>
>> In light of recent discussions regarding updating passes to respect convergent semantics, and whether or not it is sufficient for barriers, I would like to propose a change in convergent semantics that
2015 Jul 16
2
[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable
----- Original Message -----
> From: "Chandler Carruth" <chandlerc at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Hyojin Sung" <hsung at us.ibm.com>, llvmdev at cs.uiuc.edu
> Sent: Thursday, July 16, 2015 1:06:03 AM
> Subject: Re: [LLVMdev] Improving loop vectorizer support for loops
> with a volatile iteration
2018 Apr 03
4
SCEV and LoopStrengthReduction Formulae
I am attempting to implement a minor loop strength reduction optimization for
targets that support compare and jump fusion, specifically
TTI::canMacroFuseCmp(). My approach might be wrong; however, I am soliciting
the idea for feedback, so that I can implement this correctly. My plan is to
add a Supplemental LSR formula to LoopStrengthReduce.cpp that optimizes the
following case, but perhaps
2017 Jul 18
4
A bug related with undef value when bootstrap MemorySSA.cpp
On Mon, Jul 17, 2017 at 5:11 PM, Wei Mi <wmi at google.com> wrote:
> On Mon, Jul 17, 2017 at 2:09 PM, Sanjoy Das
> <sanjoy at playingwithpointers.com> wrote:
>> Hi,
>>
>> On Mon, Jul 17, 2017 at 1:56 PM, Daniel Berlin via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>>
>>>
>>> On Mon, Jul 17, 2017 at 1:53 PM, Wei Mi
2019 Aug 09
4
How to best deal with undesirable Induction Variable Simplification?
Hi Hal,
I see. So LSR could theoretically counteract undesirable Ind Var transformations but it's not implemented at the moment?
I think I've managed to come up with a small reproducer that also exhibits a similar problem on x86; here it is: https://godbolt.org/z/_wxzut
As you can see, when rewriteLoopExitValues is not disabled Clang generates worse code due to additional spills,
2015 Sep 22
2
[RFC] Refinement of convergent semantics
Hi Jingyue,
I consider it a very important element of the design of convergent that it does not require baseline LLVM to contain a definition of uniformity, which would itself pull in a definition of SIMT/SPMD, warps, threads, etc. The intention is that it should be a conservative (but hopefully not too conservative) approximation, and that implementations of specific GPU programming models
2013 Jul 29
3
[LLVMdev] IR Passes and TargetTransformInfo: Straw Man
On Jul 29, 2013, at 9:05 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> On 7/16/2013 11:38 PM, Andrew Trick wrote:
>> Since introducing the new TargetTransformInfo analysis, there has been some confusion over the role of target heuristics in IR passes. A few patches have led to interesting discussions.
>>
>> To centralize the discussion, until we get
2018 Feb 22
3
Loop splitting as a special case of unswitch
For the example code below,
    int L = M + 10;
    for (k = 1 ; k <=L; k++) {
        dummy();
        if (k < M)
            dummy2();
    }
we can split the loop into two parts like:
    for (k = 1 ; k != M; k++) {
        dummy();
        dummy2();
    }
    for (; k <=L; k++) {
        dummy();
    }
By splitting the loop, we can remove the conditional block in the loop and indirectly increase vectorization
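
A minimal sketch for checking that the split form above does the same work as the original loop. Everything here other than `k`, `L` and `M` is an assumption made for illustration: the `counts` struct and the two function names are hypothetical, and counters stand in for the side effects of dummy() and dummy2(). The first split loop is written with `k < M` instead of the snippet's `k != M`, so that it also terminates immediately when M is less than 1; for M >= 1 the two conditions behave the same.

```c
/* Counters stand in for calls to dummy() and dummy2(). */
struct counts {
    int dummy;
    int dummy2;
};

/* Original form: one loop with a conditional inside the body. */
struct counts run_original(int M) {
    struct counts c = {0, 0};
    int L = M + 10;
    for (int k = 1; k <= L; k++) {
        c.dummy++;
        if (k < M)
            c.dummy2++;
    }
    return c;
}

/* Split form: the first loop covers the iterations where k < M holds,
 * the second covers the remaining iterations up to L. */
struct counts run_split(int M) {
    struct counts c = {0, 0};
    int L = M + 10;
    int k = 1;
    for (; k < M; k++) { /* assumption: k < M rather than k != M */
        c.dummy++;
        c.dummy2++;
    }
    for (; k <= L; k++) {
        c.dummy++;
    }
    return c;
}
```

Neither loop in the split form contains a branch in its body. The sketch assumes M is small enough that M + 10 does not overflow.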