similar to: [LLVMdev] Vectorization of loops with conditional dereferencing


2013 Nov 01
0
[LLVMdev] Vectorization of loops with conditional dereferencing
Hi Hal, Yes, I agree that this is a problem that prevents vectorization in many loops. Another problem that we have is that sunk loads don’t preserve their control dependence properties. For example in the code below, if we sink the load into the branch then we can't vectorize the loop. x = A[i] if (cond) { sum += x; } I agree with you that checking the first and last element for each
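The hoisted and sunk forms of the load mentioned in this message can be sketched in C as below. This is only an illustration: the function names, the `int` element type, and the use of a per-element `cond[i]` array in place of the message's `cond` are assumptions, not part of the original thread. Both functions return the same sum; they differ only in whether `A[i]` is read unconditionally.

```c
#include <stddef.h>

/* Load hoisted above the branch: A[i] is read on every iteration, so the
   memory access is unconditional and only the add depends on the condition. */
int sum_hoisted(const int *A, const int *cond, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; ++i) {
        int x = A[i];
        if (cond[i])
            sum += x;
    }
    return sum;
}

/* Load sunk into the branch: A[i] is read only when cond[i] is nonzero.
   A vectorizer now has to prove that reading A[i] for the other iterations
   is safe before it can turn the access into a plain vector load. */
int sum_sunk(const int *A, const int *cond, size_t n) {
    int sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (cond[i])
            sum += A[i];
    }
    return sum;
}
```

In the hoisted form the source program itself already dereferences every element, which is the control dependence property the message says is lost once the load is sunk.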
2013 Nov 14
0
[LLVMdev] Vectorization of loops with conditional dereferencing
On 1 November 2013 13:40, Hal Finkel <hfinkel at anl.gov> wrote: > Done = false; > FirstI = -1, LastI = -1; > while (!Done) { > for (I = FirstI+1; I < N; ++I) > if (r[i] > 0) { > FirstI = I; > break; > } > > for (; I < N && !page_bound(&m[i]) && ...; ++I) { > if (r[i] > 0) > LastI = I; >
2013 Nov 01
1
[LLVMdev] Vectorization of loops with conditional dereferencing
----- Original Message ----- > Hi Hal, > > Yes, I agree that this is a problem that prevents vectorization in > many loops. Another problem that we have is that sunk loads don’t > preserve their control dependence properties. For example in the > code below, if we sink the load into the branch then we can't > vectorize the loop. > > x = A[i] > if (cond) { >
2013 Nov 14
3
[LLVMdev] Vectorization of loops with conditional dereferencing
I think that the best way to move forward with the vectorization of this loop is to make progress on the vectorization pragmas. The LoopVectorizer is already prepared for handling pragmas and we just need to add the clang-side support. Is anyone planning to work on this? On Nov 14, 2013, at 2:18 AM, Renato Golin <renato.golin at linaro.org> wrote: > On 1 November 2013 13:40, Hal
2016 Aug 25
2
InstList insert depreciated?
Jon, > You want: > TaintVar->insertAfter(FirstI); This worked! Thank you. On Thu, Aug 25, 2016 at 9:38 AM, Jonathan Roelofs <jonathan at codesourcery.com> wrote: > > > On 8/25/16 7:01 AM, Shehbaz Jaffer via llvm-dev wrote: >> >> I tried an alternative way of adding instruction by first getting the >> first instruction of the basic block, and then
2016 Aug 25
2
InstList insert depreciated?
Hi llvm-devel, I have migrated my codebase from llvm-3.6 to llvm 3.8.1-stable. Although I was able to resolve most of the problems, I am facing issues resolving the following: To insert an instruction immediately after the first instruction within a basic block, I first get all instructions in my basic block in an instruction container list. Once that is done, I insert my new instruction in the
2011 Aug 24
2
Append a value to a vector
This should be easy, but it does not work. I have 3 vectors (activeT, inactT, activeR). The idea is that if the last value in inactT is higher than the last in activeT, this value has to be appended to activeT and the last value in another vector called activeR has to be repeated (the vectors are at the bottom). I have done this: activeT=round(as.numeric(activeT)) inactT=
2018 Jun 01
2
[VPlan] about vectorization factor selection
Hi, The current loop vectorizer uses a range of vectorization factors computed by MaxVF. For each VF, it sets up uniform and scalar info before building VPlan and the final best VF selection. The best VF is also selected within the VF range. for (unsigned VF = 1; VF <= MaxVF; VF *= 2) { // Collect Uniform and Scalar instructions after vectorization with VF.
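The power-of-two enumeration described in this message can be sketched as a small standalone C function. This is a rough illustration under stated assumptions, not LLVM's implementation: `select_best_vf`, `cost_fn`, and `toy_cost` are hypothetical names, and ranking candidates by cost per lane is an assumed selection rule.

```c
#include <assert.h>

/* Hypothetical cost callback: estimated cost of one loop iteration at a
   given vectorization factor. */
typedef unsigned (*cost_fn)(unsigned vf);

/* Toy cost model (an assumption for illustration): cheap up to VF=4,
   then much more expensive, as if wider vectors had to be split. */
unsigned toy_cost(unsigned vf) {
    return vf <= 4 ? 8 + vf : 8 + 4 * vf;
}

/* Walk VF = 1, 2, 4, ... up to max_vf and keep the candidate with the
   lowest cost per lane. The comparison cost(vf)/vf < best_cost/best_vf is
   cross-multiplied so that no division is needed. */
unsigned select_best_vf(unsigned max_vf, cost_fn loop_cost) {
    assert(max_vf >= 1 && max_vf <= (1u << 30));
    unsigned best_vf = 1;
    unsigned best_cost = loop_cost(1);
    for (unsigned vf = 2; vf <= max_vf; vf *= 2) {
        unsigned cost = loop_cost(vf);
        if ((unsigned long long)cost * best_vf <
            (unsigned long long)best_cost * vf) {
            best_vf = vf;
            best_cost = cost;
        }
    }
    return best_vf;
}
```

With the toy model, `select_best_vf(16, toy_cost)` settles on 4, since the per-lane cost rises again at VF=8 and VF=16.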
2017 Mar 15
4
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 07:50 PM, Adam Nemet wrote: > >> On Mar 14, 2017, at 11:33 AM, Hal Finkel <hfinkel at anl.gov >> <mailto:hfinkel at anl.gov>> wrote: >> >> >> >> On 03/14/2017 12:11 PM, Adam Nemet wrote: >>> >>>> On Mar 14, 2017, at 9:49 AM, Hal Finkel <hfinkel at anl.gov >>>> <mailto:hfinkel at
2001 Mar 12
2
Regressions with monotonicity constraints
This seems to be a recurrent topic, but I don't remember hearing a definitive answer. I also apologize for cross-posting. Say I have a numerical response variable and a bunch of multi-level factors I want to use for modeling. I don't expect factor interaction to be important so there will be no interactions in the model. All this would be a perfect job for ANOVA except for one additional
2017 Feb 27
4
[Proposal][RFC] Epilog loop vectorization
Thanks for looking into this. 1) Issues with rerunning the vectorizer: The vectorizer might generate redundant alias checks while vectorizing the epilog loop. Redundant alias checks are expensive, so we would like to reuse the results of already computed alias checks. With metadata we can limit the width of the epilog loop, but I am not sure about reusing the alias check results. Any thoughts on rerunning vectorizer with reusing
2017 Feb 23
2
[Proposal][RFC] Epilog loop vectorization
On 02/22/2017 11:52 AM, Adam Nemet via llvm-dev wrote: > Hi Ashutosh, > >> On Feb 22, 2017, at 1:57 AM, Nema, Ashutosh via llvm-dev >> <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Hi, >> This is a proposal about epilog loop vectorization. >> Currently Loop Vectorizer inserts an epilogue loop for handling loops
2018 Aug 02
2
Vectorizing remainder loop
Hi Hameeza, Aside from Ashutosh's patch..... When the vector width is that large, we can't keep vectorizing remainder like below. It'll be a huge code size if nothing else ---- hitting ITLB miss because of this is very bad, for example. VF=2048 // main vector loop VF=1024 // vectorized remainder 1 VF=512 // vectorized remainder 2 ... Vectorize remainder until trip count is
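The code-size concern in this message can be made concrete by counting how many loop bodies such a cascade would emit. The sketch below is only an illustration: `cascade_loop_copies` is a hypothetical helper, and stopping at a caller-chosen `min_vf` is an assumption, since the message is cut off before it says where the cascade ends.

```c
#include <assert.h>

/* Counts the loop bodies emitted when a main vector loop of width main_vf
   is followed by remainder loops of half the width each, down to min_vf.
   Both widths are assumed to be powers of two with min_vf <= main_vf. */
unsigned cascade_loop_copies(unsigned main_vf, unsigned min_vf) {
    assert(min_vf >= 1 && min_vf <= main_vf);
    unsigned copies = 0;
    for (unsigned vf = main_vf; vf >= min_vf; vf /= 2)
        ++copies;
    return copies;
}
```

For the VF=2048 case, halving all the way down to a scalar loop gives `cascade_loop_copies(2048, 1) == 12` copies of the loop body.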
2018 Aug 03
2
Vectorizing remainder loop
>it cannot afford large size masks for large vectors So, even a standard way of vectorizing remainder in masked or unmasked fashion wouldn’t work, I suppose. Ouch. I suppose VPlan should be able to model this kind of gigantic remainder vector code (when the time comes). Not pretty at all, though. Now, be fully aware that Direction #2 is really a poor (or rather extremely poor) person’s
2016 Jun 15
8
[RFC] Allow loop vectorizer to choose vector widths that generate illegal types
Hello, Currently the loop vectorizer will, by default, not consider vectorization factors that would make it generate types that do not fit into the target platform's vector registers. That is, if the widest scalar type in the scalar loop is i64, and the platform's largest vector register is 256-bit wide, we will not consider a VF above 4. We have a command line option (-mllvm
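The limit described in this message is a single division, sketched here in C. `max_legal_vf` is a hypothetical helper name, and the calculation covers only the register-width rule stated in the message, not any other constraint the vectorizer may apply.

```c
#include <assert.h>

/* Largest vectorization factor whose vector type still fits in one
   register: register width divided by the widest scalar type in the loop. */
unsigned max_legal_vf(unsigned register_bits, unsigned widest_type_bits) {
    assert(widest_type_bits > 0);
    return register_bits / widest_type_bits;
}
```

For the case in the message, `max_legal_vf(256, 64)` is 4: four i64 lanes fill a 256-bit register.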
2017 Mar 14
4
[Proposal][RFC] Epilog loop vectorization
On 03/14/2017 12:11 PM, Adam Nemet wrote: > >> On Mar 14, 2017, at 9:49 AM, Hal Finkel <hfinkel at anl.gov >> <mailto:hfinkel at anl.gov>> wrote: >> >> >> On 03/14/2017 11:21 AM, Adam Nemet wrote: >>> >>>> On Mar 14, 2017, at 6:00 AM, Nema, Ashutosh <Ashutosh.Nema at amd.com >>>> <mailto:Ashutosh.Nema at
2019 Sep 09
3
Vectorizing multiple exit loops
I've recently mentioned in a few places that I'm interested in enhancing the loop vectorizer to handle multiple exit loops, and have been asked to share plans.  This email is intended to a) share my current thinking and b) help spark discussion among interested parties.  I do need to warn that my near term plans for this have been delayed; I got pulled into an internal project
2016 Jun 16
2
[RFC] Allow loop vectorizer to choose vector widths that generate illegal types
Some thoughts: o To determine the VF for a loop with mixed data sizes, choosing the smallest ensures each vector register used is full, choosing the largest will minimize the number of vector registers used. Which one’s better, or some size in between, depends on the target’s costs for the vector operations, availability of registers and possibly control/memory divergence and trip count. “This is
2016 Jun 16
2
[RFC] Allow loop vectorizer to choose vector widths that generate illegal types
Hi Michael, Thank you for working on this. The loop vectorizer tries a bunch of different vectorization factors and stops at the widest word size mostly because of compile time concerns. For every vectorization factor that we check we have to scan all of the instructions in the loop and make multiple calls into TTI. If you decide to increase the VF enumeration space then you will linearly
2016 Oct 06
2
LoopVectorizer -- generating bad and unhandled shufflevector sequence
Hi, I have experimented with enabling the LoopVectorizer for SystemZ. I have come across a loop which, when vectorized, seems to have been poorly generated. In short, there seems to be a completely unnecessary sequence of shufflevector instructions that doesn't get optimized away anywhere. In other words, there is a shuffle sequence that leads back to the original vector: [0 1 2 3