search for: outerloop

Displaying 15 results from an estimated 15 matches for "outerloop".

2016 Sep 21
5
RFC: Extending LV to vectorize outerloops
Proposal for extending the Loop Vectorizer to handle Outer Loops ================================================================ Goal: ----- We propose to extend the innermost Loop Vectorizer to also handle outerloops (cf.[1]). Our aim is to best leverage the efforts already invested in the existing innermost Loop Vectorizer rather than introduce a separate pass dedicated to outerloop vectorization. This proposal will support explicit vector programming of loops and functions [2]. It also facilitates evaluating...
2015 Sep 11
5
[RFC] New pass: LoopExitValues
Hi Steve it seems the general consensus is that the patch feels like a work-around for a problem with LSR (and possibly other loop transformations) that introduces redundant instructions. It is probably best to file a bug and a few of your test cases. Thanks Gerolf > On Sep 10, 2015, at 4:37 PM, Steve King via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On Thu, Sep 10, 2015
2015 Sep 21
2
[RFC] New pass: LoopExitValues
On Mon, Sep 21, 2015 at 11:13 AM, Wei Mi <wmi at google.com> wrote: > I have the same worry as Philip and Hal that the new LoopExitValues > pass may increase some live ranges significantly in certain cases > because it reuses values across outerloop iterations. Like the following > hypothetical case, the value reuse will create a live range living > across loop2, loop3, .... But we can add some simple logic to obviate > such cases. > Thanks Wei. Can you please give your ideas about logic to catch abuse of live ranges?
2020 Jul 01
6
[RFC] Compiled regression tests.
...ppended to the end > and MDNodes with same content uniqued, making it impossible to > associate particular MDNodes with specific functions. > > > Ideally the regression test would be robust and understandable, > achievable with two asserts in a unittest: > >     Loop &OuterLoop = *LI->begin(); >     ASSERT_TRUE(OuterLoop.isAnnotatedParallel()); >     Loop &InnerLoop = *OuterLoop.begin(); >     ASSERT_TRUE(InnerLoop.isAnnotatedParallel()); I definitely agree that we should not be trying to do this kind of checking using textual metadata-node matching in...
2020 Jun 24
6
[RFC] Compiled regression tests.
On Wed, 24 June 2020 at 10:12, David Blaikie <dblaikie at gmail.com> wrote: > > As mentioned in the Differential, generating the tests automatically > > will lose information about what actually is intended to be tested, > > Agreed - and I didn't mean to suggest tests should be automatically > generated. I work pretty hard in code reviews to encourage tests to
2015 Sep 21
4
[RFC] New pass: LoopExitValues
Hi Folks, Let's keep this optimization alive. To summarize: several folks voiced general support, but with questions about why existing optimizations do not already catch this case. Deep dive by Wei Mi showed that the optimization is most likely not a clean-up of LSR sloppiness, but something new. Follow-up by myself confirmed that the redundancy eliminated by the LoopExitValues pass exists in
2020 Jul 01
5
[RFC] Compiled regression tests.
...n tests; metadata of all functions all appended to the end and MDNodes with same content uniqued, making it impossible to associate particular MDNodes with specific functions. Ideally the regression test would be robust and understandable, achievable with two asserts in a unittest: Loop &OuterLoop = *LI->begin(); ASSERT_TRUE(OuterLoop.isAnnotatedParallel()); Loop &InnerLoop = *OuterLoop.begin(); ASSERT_TRUE(InnerLoop.isAnnotatedParallel()); -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachmen...
2017 Dec 06
3
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
...traction of the output IR, and the vector code generation is driven by this abstract representation.   This is a follow up of the previous RFCs and LLVM Developer Conference presentations:           http://lists.llvm.org/pipermail/llvm-dev/2016-September/105057.html (RFC: Extending LV to vectorize outerloops)           http://lists.llvm.org/pipermail/llvm-dev/2017-February/110159.html (RFC: Introducing VPlan to model the vectorized code and drive its transformation)           https://www.youtube.com/watch?v=XXAvdUwO7kQ (Extending LoopVectorizer: OpenMP4.5 SIMD and Outer Loop Auto-Vectorization)       ...
2017 Dec 06
5
[LV][VPlan] Status Update on VPlan ----- where we are currently, and what's ahead of us
...G eventually becomes the abstraction of the output IR, and the vector code generation is driven by this abstract representation.   Please refer to the following for more detailed background:   RFCs        http://lists.llvm.org/pipermail/llvm-dev/2016-September/105057.html (Extending LV to vectorize outerloops)        http://lists.llvm.org/pipermail/llvm-dev/2017-February/110159.html  (Introducing VPlan to model the vectorized code and drive its transformation)   "Extending LoopVectorizer: OpenMP4.5 SIMD and Outer Loop Auto-Vectorization"  (Saito, et.al.) 2016 LLVM Developers' Meeting http...
2017 Dec 14
3
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
...nd the vector code generation is driven by this abstract representation. > > This is a follow up of the previous RFCs and LLVM Developer Conference presentations: >            > http://lists.llvm.org/pipermail/llvm-dev/2016-September/105057.html > (RFC: Extending LV to vectorize outerloops) >           > http://lists.llvm.org/pipermail/llvm-dev/2017-February/110159.html > (RFC: Introducing VPlan to model the vectorized code and drive its > transformation) >           https://www.youtube.com/watch?v=XXAvdUwO7kQ (Extending > LoopVectorizer: OpenMP4.5 SIMD and O...
2018 Jan 15
0
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
...neration is driven by this abstract representation. >> >> This is a follow up of the previous RFCs and LLVM Developer Conference presentations: >> >> http://lists.llvm.org/pipermail/llvm-dev/2016-September/105057.html >> (RFC: Extending LV to vectorize outerloops) >> >> http://lists.llvm.org/pipermail/llvm-dev/2017-February/110159.html >> (RFC: Introducing VPlan to model the vectorized code and drive its >> transformation) >>           https://www.youtube.com/watch?v=XXAvdUwO7kQ (Extending >> LoopVectorize...
2007 Dec 02
2
Optimised qmf_synth and iir_mem16
...nd of the stack frame from here on, but we're not @ calling anything so it shouldn't matter @ Main loop, register usage: @ r0 = xx1, r1 = xx2, r2 = a, r3 = y, r4 = M, r5 = x10, r6 = x11, r7 = x20 @ r8 = x21, r9 = [a1, a0], r10 = acc0, r11 = acc1, r12 = acc2, r14 = acc3 0: @ Outerloop mov r10, #16384 @ Init accumulators to rounding const mov r11, #16384 mov r12, #16384 mov r14, #16384 ldrsh r5, [r0, #-4]! @ r5 = x10, r0 = &xx1[N2 - 2] ldrsh r7, [r1, #-4]! @ r7 = x20, r1 = &xx2[N2 - 2] str...
2020 May 21
2
LV: predication
...fied example of that: for (i = 0; i < N; i++) { Sum = 0; M = Size - i; for (j = 0; j < M; j++) Sum += Input[j] * Input[j+i]; Output[i] = Sum; } We are vectorising the inner-loop and we need to know its BTC. Its loop upperbound M depends on outerloop i, which results in a recursive SCEV expression. %trip.count.minus.1 = sub i32 %1, 1 %broadcast.splatinsert = insertelement <4 x i32> undef, i32 %trip.count.minus.1, i32 0 %broadcast.splat = shufflevector <4 x i32> %broadcast.splatinsert, <4 x i32> undef, <4 x i32>...
2018 Jan 16
1
[RFC][LV][VPlan] Proposal for Outer Loop Vectorization Implementation Plan
...by this abstract representation. >>>    This is a follow up of the previous RFCs and LLVM Developer >>> Conference presentations: >>>             >>> http://lists.llvm.org/pipermail/llvm-dev/2016-September/105057.html >>> (RFC: Extending LV to vectorize outerloops) >>>             >>> http://lists.llvm.org/pipermail/llvm-dev/2017-February/110159.html >>> (RFC: Introducing VPlan to model the vectorized code and drive its >>> transformation) >>>             https://www.youtube.com/watch?v=XXAvdUwO7kQ (Extending &gt...
2020 May 20
2
LV: predication
Hi Ayal, Let me start with commenting on this: > A dedicated intrinsic that freezes the compare instruction, for no apparent reason, may potentially cripple subsequent passes from further optimizing the vectorized loop. The point is we have a very good reason, which is that it passes on the right information to the backend, enabling optimisations as opposed to crippling them. The compare