Displaying 11 results from an estimated 11 matches for "stripmine".
2018 Mar 08
1
[Polly] Reduced code analyzability moving from LLVM 3.9.0 to 5.0.1
...code that I am trying to optimize [1] adjusts a picture's colors to get
an Instagram-like effect.
To improve code analyzability on LLVM 3.9.0, I made the following changes:
- Improve SCoP detection through -polly-process-unprofitable
- Enable outer loop vectorization through -polly-vectorizer=stripmine,
disabling timeouts with -polly-dependences-computeout=0
- Avoid sign extensions by replacing all 32-bit ints with longs, as
Polly seems to model using 64-bit loop counters
- Avoid interrupting control flow through -ffast-math and moving mallocs
to the top of the code
So to compile, we have:
c...
2018 Mar 09
1
[Polly] Reduced code analyzability moving from LLVM 3.9.0 to 5.0.1
...ts a picture's colors to get
>> an Instagram-like effect.
>>
>> To improve code analyzability on LLVM 3.9.0, I made the following changes:
>> - Improve SCoP detection through -polly-process-unprofitable
>> - Enable outer loop vectorization through -polly-vectorizer=stripmine,
>> disabling timeouts with -polly-dependences-computeout=0
>> - Avoid sign extensions by replacing all 32-bit ints with longs, as
>> Polly seems to model using 64-bit loop counters
>> - Avoid interrupting control flow through -ffast-math and moving mallocs
>> to the t...
2017 Jul 01
3
Jacobi 5 Point Stencil Code not Vectorizing
...2017 12:30 PM, "hameeza ahmed" <hahmed2305 at gmail.com> wrote:
> I even tried polly but still my llvm IR does not contain vector
> instructions. i used the following command;
>
> clang -S -emit-llvm stencil.c -march=knl -O3 -mllvm -polly -mllvm
> -polly-vectorizer=stripmine -o stencil_poly.ll
>
> Please specify what is wrong with my code?
>
>
> On Sat, Jul 1, 2017 at 4:08 PM, hameeza ahmed <hahmed2305 at gmail.com>
> wrote:
>
>> Hello,
>>
>> I am trying to vectorize following stencil code;
>>
>> #include <std...
2017 Jul 01
2
Jacobi 5 Point Stencil Code not Vectorizing
Hello,
I am trying to vectorize following stencil code;
#include <stdio.h>
#define N 100351
// This function computes 2D-5 point Jacobi stencil
void stencil(int a[restrict][N])
{
    int i, j, k;
    for (k = 0; k < 100; k++)
    {
        for (i = 1; i <= N-2; i++)
        {
            for (j = 1; j <= N-2; j++)
            {
                a[i][j] = 0.25 * (a[i][j] + a[i-1][j] + a[i+1][j] + a[i][j-1] +
2017 Jul 01
2
Jacobi 5 Point Stencil Code not Vectorizing
...hmed2305 at gmail.com> wrote:
>>
>>> I even tried polly but still my llvm IR does not contain vector
>>> instructions. i used the following command;
>>>
>>> clang -S -emit-llvm stencil.c -march=knl -O3 -mllvm -polly -mllvm
>>> -polly-vectorizer=stripmine -o stencil_poly.ll
>>>
>>> Please specify what is wrong with my code?
>>>
>>>
>>> On Sat, Jul 1, 2017 at 4:08 PM, hameeza ahmed <hahmed2305 at gmail.com>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I a...
2017 Oct 23
3
Jacobi 5 Point Stencil Code not Vectorizing
...305@gmail.com> wrote:
>>> I even tried polly but still my llvm IR does not contain vector instructions. i used the following command;
>>>
>>> clang -S -emit-llvm stencil.c -march=knl -O3 -mllvm -polly -mllvm -polly-vectorizer=stripmine -o stencil_poly.ll
>>>
>>> Please specify what is wrong with my code?
>>>
>>> On Sat, Jul 1, 2017 at 4:08 PM, hameeza ahmed <hahmed2305@gmail.com> wr...
2017 Oct 24
3
Jacobi 5 Point Stencil Code not Vectorizing
...> >>> I even tried polly but still my llvm IR does not contain vector
>> >>> instructions. i used the following command;
>> >>>
>> >>> clang -S -emit-llvm stencil.c -march=knl -O3 -mllvm -polly -mllvm
>> >>> -polly-vectorizer=stripmine -o stencil_poly.ll
>> >>>
>> >>> Please specify what is wrong with my code?
>> >>>
>> >>> On Sat, Jul 1, 2017 at 4:08 PM, hameeza ahmed <hahmed2305 at gmail.com>
>> >>> wrote:
>> >>>> Hello,
>>...
2016 May 02
2
[GSoC 2016] Attaining 90% of the turbo boost peak with a C version of Matrix-Matrix Multiplication
Hi Tobias,
according to [1], we can expect 90% of the turbo boost peak of the
processor with a C version of Matrix-Matrix Multiplication that is
similar to the one presented in [1]. In case of Intel Core i7-3820
SandyBridge, the theoretical maximal performance of the machine is
28.8 gflops and hence the expected number is 25.92 gflops.
However, in case of, for example, n = m = 1056 and k = 1024
2016 Feb 03
3
opt with Polly doesn't find the passes
...tiling
-polly-vectorizer - Select the vectorization strategy
  =none      - No Vectorization
  =polly     - Polly internal vectorizer
  =stripmine - Strip-mine outer loops for the loop-vectorizer to trigger
(If I grep the polly source for 'canonicalize' I see the module pass
class PollyCanonicalize).
What am I missing?
Thanks,
Frank
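For readers unfamiliar with the term in the option listing above: strip-mining splits one loop into an outer loop over fixed-size strips and an inner loop over the elements of a single strip, so that a vectorizer can target the short inner loop. The following is only a minimal sketch of that idea in C. The function names and the strip width of 4 are hypothetical choices made for illustration; they are not taken from Polly, and this is not a description of what Polly actually generates.

```c
#include <assert.h>
#include <stddef.h>

/* Reference loop: scale every element of a by s. */
void scale_plain(float *a, size_t n, float s)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        a[i] *= s;
}

/* Illustrative strip width (an assumption, not a Polly default). */
#define STRIP 4

/* Strip-mined version of the same loop: the outer loop steps from
 * strip to strip, the inner loop covers at most STRIP elements.
 * The iteration order and the result are unchanged. */
void scale_stripmined(float *a, size_t n, float s)
{
    assert(a != NULL || n == 0);
    for (size_t ii = 0; ii < n; ii += STRIP) {
        /* Clamp the last strip when n is not a multiple of STRIP. */
        size_t end = (n - ii < STRIP) ? n : ii + STRIP;
        for (size_t i = ii; i < end; i++)
            a[i] *= s;
    }
}
```

Both functions compute the same result; only the loop structure differs. Whether a compiler vectorizes the inner loop depends on the target and the optimization flags in use.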
2020 Jan 03
10
Writing loop transformations on the right representation is more productive
In the 2018 LLVM DevMtg [1], I presented some shortcomings of how LLVM
optimizes loops. In summary, the biggest issues are (a) the complexity
of writing a new loop optimization pass (including needing to deal
with a variety of low-level issues, a significant amount of required
boilerplate, the difficulty of analysis preservation, etc.), (b)
independent optimization heuristics and a fixed pass
2016 Sep 20
7
RFC: Implement variable-sized register classes
I have posted a patch that switches the API to one that supports this
(yet non-existent functionality) earlier:
https://reviews.llvm.org/D24631
The comments from that were incorporated into the following RFC.
Motivation:
Certain targets feature "variable-sized" registers, i.e. a situation
where the register size can be configured by a hardware switch. A
common instruction set