Displaying 20 results from an estimated 500 matches similar to: "Pattern not recognized as reduction"
2019 May 02
2
llvm is illegally vectorizing with a recurrence on skylake
Hi -- I have found a bug in an HPC code where llvm is vectorizing a loop on
Skylake that has an obvious recurrence. I derived a small test case based
on the original benchmark below:
/*****************************************************************/
static void __attribute__ ((always_inline)) one(
const int *restrict in, const int *const end,
const unsigned shift, int *const restrict index,
2018 Jun 05
2
How to get optimization remarks while testing with lnt in llvm
Hi, I'm new to llvm and am trying to run benchmarks from the test-suite
using lnt to check loop-vectorization for various benchmarks.
Tests are compiling and executing fine, but I am not getting optimization
remarks while using flags like -Rpass-missed=loop-vectorize and
-Rpass-analysis=loop-vectorize
I've tried running it like this:
lnt runtest test-suite --sandbox SANDBOX --cc
2018 Sep 20
2
Vectorization width not correct using #pragma clang loop vectorize_width
Hello,
I'm trying to set the vector width using #pragma clang loop vectorize_width(32)
but I'm getting width 8 for the following kernel:
#define M 128
#define N 128
#define SQRT_FUN(x) sqrtf(x)
int main(int argc, char** argv)
{
/* Variable declaration/allocation. */
double float_n = (double)N;
double data[N*M];
double corr[M*M];
double mean[M];
double stddev[M];
uint32_t
2016 Mar 07
3
Profile-based inlining status
Hello,
I'm learning how LLVM performs PGO (profile-guided optimizations) by using
the instrumentation-based profile build (-fprofile-instr-generate and
-fprofile-instr-use).
However, by checking the emitted optimization reports (-Rpass=inline
-Rpass-missed=inline -Rpass-analysis=inline), I found no difference in
inlining behavior with and without PGO for a few SPEC benchmarks.
2020 Sep 01
2
Vectorization of math function failed?
I've tried to do:
clang++ -O3 -march=native -mtune=native \
-Rpass=loop-vectorize,slp-vectorize \
-Rpass-missed=loop-vectorize,slp-vectorize \
-Rpass-analysis=loop-vectorize,slp-vectorize \
-ffast-math -ffp-model=fast -ffp-exception-behavior=ignore -ffp-contract=fast \
-c -o vec.o vec.cc
But I've got no feedback.
--
Alexandre Bique
2016 Oct 09
3
On Loop Distribution pass
Dear community,
Our team at IITH has been experimenting with the loop-distribution pass in
LLVM. We see the following results on a few benchmarks.
clang -O3 -mllvm -enable-loop-distribute -Rpass=loop-distribute file.c
clang -O3 -mllvm -enable-loop-distribute -Rpass-analysis=loop-distribute file.c
TORCH
2020 Jun 24
2
Loop vectorization and unsafe floating point math
Hi llvm-dev!
We are doing some fuzz testing using C program generators,
and one question that came up when generating a program with
both floating point arithmetic and loop pragmas was:
Is the loop vectorizer really allowed to vectorize a loop when
it can't prove that it is safe to reorder fp math, even if
there is a loop pragma that hints about a preferred width?
When reading here
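The concern in the question above is that floating-point addition is not associative, so a vectorized reduction that keeps per-lane partial sums can return a different result than the scalar loop. A minimal sketch in C, assuming a made-up four-element input; the names sum_in_order and sum_two_lanes are hypothetical and only mimic what a width-2 vectorized reduction would do. They are not taken from the thread or from LLVM:

```c
#include <stddef.h>

/* Scalar reduction in source order: ((a[0] + a[1]) + a[2]) + ... */
double sum_in_order(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Same reduction with two interleaved partial sums, which is the kind of
 * reassociation a width-2 vectorized reduction performs (illustrative only). */
double sum_two_lanes(const double *a, size_t n) {
    double lane0 = 0.0, lane1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        lane0 += a[i];      /* even-indexed elements */
        lane1 += a[i + 1];  /* odd-indexed elements */
    }
    if (i < n)              /* leftover element when n is odd */
        lane0 += a[i];
    return lane0 + lane1;   /* final horizontal add */
}
```

For the input {1e20, 1.0, -1e20, 1.0}, compiled without -ffast-math, sum_in_order returns 1.0 and sum_two_lanes returns 2.0, which is why reordering needs to be proven or explicitly permitted.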
2020 Sep 01
2
Should llvm optimize 1.0 / x ?
Hi Quentin,
You are correct, I could manage to get clang to use vrcpps, but not in
a satisfying way:
clang++ -O3 -march=native -mtune=native \
-Rpass=loop-vectorize -Rpass-missed=loop-vectorize \
-Rpass-analysis=loop-vectorize \
-ffast-math -ffp-model=fast -ffp-exception-behavior=ignore -ffp-contract=fast \
-c -o vec.o vec.cc
0000000000000140 <_Z4fct4Dv4_f>:
140: c5 f8 53 c8
2016 May 11
4
Filter optimization remarks by the hotness of the code region
> On May 11, 2016, at 3:37 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> ----- Original Message -----
>> From: "Adam Nemet" <anemet at apple.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "llvm-dev (llvm-dev at lists.llvm.org)" <llvm-dev at lists.llvm.org>
>> Sent: Wednesday, May 11, 2016 1:15:42 AM
2016 Oct 10
2
On Loop Distribution pass
> On Oct 10, 2016, at 2:50 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>
> From: "Dangeti Tharun kumar via llvm-dev" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Cc: "Santanu Das" <cs15mtech11018 at iith.ac.in <mailto:cs15mtech11018 at
2018 Aug 14
2
optimization remarks
Hi,
I am trying to compare the loop vectorizer's effectiveness for different
targets relative to each other. That way, I am hoping to find loops that
are not vectorized - but could be - on my target by finding other
targets doing this successfully. With some luck, there might be
something in the Target files that could be fixed with improved
vectorization as a result...
I would like to do
2016 May 11
2
Filter optimization remarks by the hotness of the code region
Hi Hal,
> On May 10, 2016, at 5:39 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> Hi Adam,
>
> I think this would be a really useful feature to have. I don't think that the backend should be responsible for filtering, but should pass the relative hotness information to the frontend. Given that these diagnostics are not just going to be used for -Rpass and friends, but also
2017 Oct 18
2
How to emit opt report when using LTO
Hi,
I'm using clang frontend.
I'm interested in a particular hot loop in my code, and I emit a
report from the vectorizer optimization passes.
I receive nice output when passing -Rpass* flags, as long as I'm building
without LTO.
But with -flto it prints nothing.
Is there a way to emit opt reports when using LTO?
For now I can only guess whether my loop will be
2017 Jun 27
2
Next steps for optimization remarks?
Adam, thanks for all the suggestions!
One nice aspect of the `-Rpass` family of options is that I can filter
based on what I want. If I only want to see which inlines I missed, I could
use `clang -Rpass-missed="inline"`, for example. On the other hand,
optimization remark YAML always includes remarks from all passes (as far as
I can tell), which increases the amount of time it takes
2020 Sep 01
2
Vector evolution?
Hi,
Please consider the following loop:
using v4f32 = float __attribute__((__vector_size__(16)));
void fct6(v4f32 *x)
{
#pragma clang loop vectorize(enable)
for (int i = 0; i < 256; ++i)
x[i] = 7 * x[i];
}
After compiling it with:
clang++ -O3 -march=native -mtune=native \
-Rpass=loop-vectorize,slp-vectorize \
-Rpass-missed=loop-vectorize,slp-vectorize
2016 Mar 02
4
Proposal for function vectorization and loop vectorization with function calls
Proposal for function vectorization and loop vectorization with function calls
==============================================================================
Intel Corporation (3/2/2016)
This is a proposal for an initial work towards Clang and LLVM implementation of
vectorizing a function annotated with OpenMP 4.5's "#pragma omp declare simd"
(named SIMD-enabled function) and its
2015 Feb 09
3
[LLVMdev] aarch64 status for generating SIMD instructions
% clang -S -O3 -mcpu=cortex-a57 -ffast-math -Rpass-analysis=loop-vectorize dot.c
dot.c:15:1: remark: loop not vectorized: value that could not be identified as
reduction is used outside the loop [-Rpass-analysis=loop-vectorize]
}
^
dot.c:15:1: note: could not determine the original source location for :0:0
I found “llvm-as < /dev/null | llc -march=aarch64 -mattr=help” which listed a
2019 Oct 02
2
vectorize.enable
Am Mi., 2. Okt. 2019 um 15:56 Uhr schrieb Finkel, Hal J. <hfinkel at anl.gov>:
> > It's done by the WarnMissedTransformation and just looks for
> > transformation metadata that is still in the IR after all passes that
> > should have transformed them have run. That is, it does not know why
> > it is still there -- it could be because the LoopVectorize pass is not
2014 Jun 26
7
[LLVMdev] -gcolumn-info and PR 14106
For -Rpass, and other related uses, I am looking at enabling column info by
default. David pointed me at PR 14106, which seems to be the original
motivation for introducing -gcolumn-info. However, I am finding no
differences when using it on this test. I've tried building with/without
-gcolumn-info and found almost no difference in compile time (+0.4%):
$ /usr/bin/time clang -w -fno-builtin
2016 May 18
1
Optimization remarks for non-temporal stores
Hi,
There was a recent discussion on generating non-temporal stores automatically[1]. This is a hard problem to get right. What seems like a much easier but still useful problem to solve is to provide diagnostics to the user where NT stores *may* help. Then the user can add the corresponding builtins and see if they are beneficial.
My hope is that with the work to add profile-driven