Displaying 20 results from an estimated 30000 matches similar to: "[LLVMdev] [BBVectorizer] Obvious vectorization benefit, but req-chain is too short"
2012 Feb 04
0
[LLVMdev] [BBVectorizer] Obvious vectorization benefit, but req-chain is too short
On Fri, 2012-02-03 at 10:28 +0100, Tobias Grosser wrote:
> Hi Hal,
>
> this is one of the first test cases, I would love to have improved
> vectorizer support. I sent it out earlier, but I think it is a good time
> to look into it again, after the vectorizer was committed.
>
> The basic example is a set of scalar loads that load four consecutive
> elements and store
2012 Feb 04
1
[LLVMdev] [BBVectorizer] Obvious vectorization benefit, but req-chain is too short
Hello,
Thanks for your work on the bb-vectorizer. It looks like a
promising pass to be used for multi-work-item-vectorization in
pocl.
On 02/04/2012 06:21 AM, Hal Finkel wrote:
> Try it now (after r149761). If this "solution" causes other problems,
> then we may need to think of something more sophisticated.
I wonder if the case where a store is the last user of the value could
2011 Nov 23
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Mon, 2011-11-21 at 21:22 -0600, Hal Finkel wrote:
> On Mon, 2011-11-21 at 11:55 -0600, Hal Finkel wrote:
> > Tobias,
> >
> > I've attached an updated patch. It contains a few bug fixes and many
> > (refactoring and coding-convention) changes inspired by your comments.
> >
> > I'm currently trying to fix the bug responsible for causing a compile
2011 Nov 22
5
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Mon, 2011-11-21 at 11:55 -0600, Hal Finkel wrote:
> Tobias,
>
> I've attached an updated patch. It contains a few bug fixes and many
> (refactoring and coding-convention) changes inspired by your comments.
>
> I'm currently trying to fix the bug responsible for causing a compile
> failure when compiling
>
2013 Jul 04
3
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
Hi,
Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some information that is needed by the vectorization passes to
2011 Dec 02
5
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On 11/23/2011 05:52 PM, Hal Finkel wrote:
> On Mon, 2011-11-21 at 21:22 -0600, Hal Finkel wrote:
>> > On Mon, 2011-11-21 at 11:55 -0600, Hal Finkel wrote:
>>> > > Tobias,
>>> > >
>>> > > I've attached an updated patch. It contains a few bug fixes and many
>>> > > (refactoring and coding-convention) changes inspired
2013 Jul 05
0
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
On 07/04/2013 01:39 PM, Stéphane Letz wrote:
> Hi,
>
> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be vectorized with opt -O3 -vectorize-loops. So our guess is that our generated LLVM IR lacks some
2013 Jul 05
2
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
Le 5 juil. 2013 à 04:11, Tobias Grosser <tobias at grosser.es> a écrit :
> On 07/04/2013 01:39 PM, Stéphane Letz wrote:
>> Hi,
>>
>> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be
2014 Sep 19
3
[LLVMdev] [Vectorization] Mis match in code generated
Hi Arnold,
Thanks for your reply.
I tried test case as suggested by you.
void foo(int *a, int *sum) { *sum =
a[0]+a[1]+a[2]+a[3]+a[4]+a[5]+a[6]+a[7]+a[8]+a[9]+a[10]+a[11]+a[12]+a[13]+a[14]+a[15]; }
so that it has a 'store' in its IR.
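As a rough illustration of what "horizontal reduction" means for this test case, the sketch below places the 16-element sum next to a regrouped form that keeps four lane-wise partial sums and adds them together at the end. The name `foo_lanes` and the lane count of 4 are assumptions made for illustration; this is not the code the SLP vectorizer emits, only a scalar model of the regrouping. Inputs are assumed small enough that no signed overflow occurs.

```c
#include <assert.h>

/* The test case from the message, reformatted for readability. */
void foo(int *a, int *sum) {
    *sum = a[0] + a[1] + a[2]  + a[3]  + a[4]  + a[5]  + a[6]  + a[7]
         + a[8] + a[9] + a[10] + a[11] + a[12] + a[13] + a[14] + a[15];
}

/* Hypothetical regrouped form: accumulate four independent lanes
 * (as a 4-wide vector add would), then reduce the lanes horizontally.
 * Assumes the additions do not overflow. */
void foo_lanes(const int *a, int *sum) {
    int lane[4] = {0, 0, 0, 0};
    for (int i = 0; i < 16; i += 4) {
        for (int l = 0; l < 4; ++l) {
            lane[l] += a[i + l];   /* one "vector" add per group of four */
        }
    }
    /* Final horizontal step: combine the four partial sums pairwise. */
    *sum = (lane[0] + lane[1]) + (lane[2] + lane[3]);
}
```

Both functions write the same value through `sum` for the same input array; only the order of the additions differs.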
IR before vectorization:
target datalayout =
"e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
target triple =
2014 Nov 10
2
[LLVMdev] [Vectorization] Mis match in code generated
Hi Suyog,
Thanks for looking at this. This has recently got itself onto my TODO list
too.
> I am not sure how much all this will improve the code quality for
> horizontal reduction
> (donno how frequently such pattern of horizontal reduction from same
> array occurs in real world/SPECS).
Actually the main loop of 470.lbm can be SLP vectorized like this. We have
three parts to it: A fully
2013 Jul 05
0
[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR
On Jul 5, 2013, at 9:50 AM, Stéphane Letz <letz at grame.fr> wrote:
>
> Le 5 juil. 2013 à 04:11, Tobias Grosser <tobias at grosser.es> a écrit :
>
>> On 07/04/2013 01:39 PM, Stéphane Letz wrote:
>>> Hi,
>>>
>>> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the C produced code using clang with -O3, or
2013 Oct 30
3
[LLVMdev] loop vectorizer
----- Original Message -----
>
>
> I ran the BB vectorizer as I guess this is the SLP vectorizer.
No, while the BB vectorizer is doing a form of SLP vectorization, there is a separate SLP vectorization pass which uses a different algorithm. You can pass -vectorize-slp to opt.
-Hal
>
> BBV: using target information
> BBV: fusing loop #1 for for.body in _Z3barmmPfS_S_...
2013 Oct 30
0
[LLVMdev] loop vectorizer
The SLP vectorizer apparently did something in the prologue of the
function (where storing of arguments on the stack happens) which then
got eliminated later on (since I don't see any vector instructions in
the final IR). Below the debug output of the SLP pass:
Args: opt -O1 -vectorize-slp -debug loop.ll -S
SLP: Analyzing blocks in _Z3barmmPfS_S_.
SLP: Found 2 stores to vectorize.
SLP:
2013 Oct 30
2
[LLVMdev] loop vectorizer
The debug messages are misleading. They should read “trying to vectorize a list of …”. The problem is that the SCEV analysis is unable to detect that C[ir0] and C[ir1] are consecutive. Is this loop from an important benchmark?
Thanks,
Nadav
On Oct 30, 2013, at 11:13 AM, Frank Winter <fwinter at jlab.org> wrote:
> The SLP vectorizer apparently did something in the prologue of the
2013 Jun 26
0
[LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its
Sent from my iPhone...
On Jun 25, 2013, at 8:14 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> ----- Original Message -----
>>
>>
>>
>> On Jun 24, 2013, at 4:24 PM, Hal Finkel < hfinkel at anl.gov > wrote:
>>
>>
>>
>>
>> Indvars should ideally preserve NSW flags whenever possible. However,
>> we don't want to
2013 Oct 30
3
[LLVMdev] loop vectorizer
On 30 October 2013 09:25, Nadav Rotem <nrotem at apple.com> wrote:
> The access pattern to arrays a and b is non-linear. Unrolled loops are
> usually handled by the SLP-vectorizer. Are ir0 and ir1 consecutive for all
> values for i ?
>
Based on his list of values, it seems that the induction stride is linear
within each block of 4 iterations, but it's not a clear
2013 Nov 01
2
[LLVMdev] loop vectorizer: this loop is not worth vectorizing
I am trying a setup where the one loop is rewritten as two loops. This
avoids the 'rem' and 'div' instructions in the index calculation (which
give the loop vectorizer a hard time).
However, with this setup the loop vectorizer complains that the loop is
too small.
LV: Checking a loop in "main"
LV: Found a loop: L3
LV: Found a loop with a very small trip count. This loop
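The rewrite described in this message can be sketched roughly as follows. The excerpt does not show the actual index function, so the one used here (blocks of `BLOCK` consecutive elements placed `STRIDE` apart) and all names are hypothetical. The single-loop form needs a division and a remainder to compute each index; the two-loop form replaces them with the outer and inner induction variables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: blocks of 4 consecutive elements, 8 apart. */
enum { BLOCK = 4, STRIDE = 8 };

/* One loop: the index needs 'div' and 'rem' on every iteration. */
void add_single_loop(float *a, const float *b, const float *c,
                     size_t nblocks) {
    for (size_t i = 0; i < nblocks * BLOCK; ++i) {
        size_t idx = (i / BLOCK) * STRIDE + (i % BLOCK);
        a[idx] = b[idx] + c[idx];
    }
}

/* Two loops: q plays the role of i / BLOCK and r of i % BLOCK,
 * so no division or remainder is needed. */
void add_two_loops(float *a, const float *b, const float *c,
                   size_t nblocks) {
    for (size_t q = 0; q < nblocks; ++q) {
        for (size_t r = 0; r < BLOCK; ++r) {
            size_t idx = q * STRIDE + r;
            a[idx] = b[idx] + c[idx];
        }
    }
}
```

In this sketch the inner loop runs only `BLOCK` (here 4) times, which may be the kind of small trip count the debug output above refers to.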
2013 Oct 31
3
[LLVMdev] loop vectorizer misses opportunity, exploit
----- Original Message -----
>
> Hi Nadav,
>
> that's the whole point of it. I can't in general make the index
> calculation simpler. The example given is the simplest non-trivial
> index function that is needed. It might well be that it's that
> simple that the index calculation in this case can be thrown away
> altogether and - as you say - be replaced by
2013 Jun 25
2
[LLVMdev] [llvm] r184698 - Add a flag to defer vectorization into a phase after the inliner and its
----- Original Message -----
>
>
>
> On Jun 24, 2013, at 4:24 PM, Hal Finkel < hfinkel at anl.gov > wrote:
>
>
>
>
> Indvars should ideally preserve NSW flags whenever possible. However,
> we don't want to rely on SCEV to preserve them. SCEV expressions are
> implicitly reassociated and uniqued in a flow-insensitive universe
> independent of the
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The following IR implements the following nested loop:
for (int i = start ; i < end ; ++i )
for (int p = 0 ; p < 4 ; ++p )
a[i*4+p] = b[i*4+p] + c[i*4+p];
define void @main(i64 %arg0, i64 %arg1, i1 %arg2, i64 %arg3, float*
noalias %arg4, float* noalias %arg5, float* noalias %arg6) {
entrypoint:
br i1 %arg2, label %L0, label %L1
L0: