similar to: Polly loop offloading to Accelerator

2018 Jan 20
1
Polly loop offloading to Accelerator
Hello, I have been working with an accelerator backend. The accelerator has large vector/SIMD units. I want the vectorized streaming (non-temporal) loops present in the code to be offloaded to the accelerator's SIMD units. I find Polly really suitable for this. My idea is to pass the generated IR to Polly and have it analyze each loop to determine whether it possesses no reuse; if such a loop is identified, accelerator
2019 Oct 13
2
Replicate Individual O3 optimizations
Hello, I want to study the individual O3 optimizations. For this I am using the following commands, but I am unable to replicate the O3 behavior. 1. Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O1 -Xclang -disable-llvm-passes -emit-llvm -S vecsum.c -o vecsum-noopt.ll 2. Documents/clang+llvm-9.0.0-x86_64-linux-gnu-ubuntu-18.04/bin/clang -O3 -mllvm -debug-pass=Arguments -emit-llvm -S
2019 Oct 19
3
Replicate Individual O3 optimizations
On Thu, Oct 17, 2019 at 11:22 AM David Greene via llvm-dev < llvm-dev at lists.llvm.org> wrote: > hameeza ahmed via llvm-dev <llvm-dev at lists.llvm.org> writes: > > > Hello, > > I want to study the individual O3 optimizations. For this I am using > > following commands, but unable to replicate O3 behavior. > > > > 1.
2018 May 15
1
Four bitcode generated with plugin-opt=save-temps
Hi Teresa Thanks for your very quick and clear explanation. I have one more question. The emit-llvm option will give you the IR for a single source file when you compile it with -c. All of those files when combined give the IR in the preopt.bc temp file. =========== So if I use "clang -emit-llvm -c" to generate the .ll file. It should be the same as the one I generated by using
2019 Oct 24
2
Replicate Individual O3 optimizations
I ran the matrix multiplication code with both approaches, O3 at clang and O3 at opt. Clang O3 is about 2.97x faster than opt O3. On Mon, Oct 21, 2019 at 8:24 AM Neil Nelson <nnelson at infowest.com> wrote: > is_sorted.cpp > bool is_sorted(int *a, int n) { > > for (int i = 0; i < n - 1; i++) > > if (a[i] > a[i + 1]) > return false; > return
2018 May 15
0
Four bitcode generated with plugin-opt=save-temps
These are the bitcode at different stages of the LTO portion of the compile. LTO merges the IR for all files being linked and optimizes them as a single monolithic module. The preopt.bc is the merged IR just after merging and before performing any LTO optimizations. internalize.bc is after performing whole program internalization. opt.bc is after the optimization pipeline, and .precodegen.bc is
2019 Jan 16
3
Issues with using scalar evolution with newer versions of LLVM IR
Thank you. I used the following commands to generate the .bc or .ll files: /Documents/clang+llvm-4.0.0-x86_64-linux-gnu-ubuntu-16.04/bin/clang -O0 -emit-llvm -S -o vec4.ll vecsum.c /Documents/clang+llvm-7.0.0-x86_64-linux-gnu-ubuntu-16.04/bin/clang -O0 -emit-llvm -S -o vec7.ll vecsum.c On Wed, Jan 16, 2019 at 6:49 AM Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > It is hard to tell
2016 Nov 10
2
Polly | Dependence detection details
Hi everyone, I would be very thankful if anyone could help me. I want to extract the dependence details using Polly. I followed these steps on the example code matmul.c: 1. clang -S -emit-llvm matmul.c -o matmul.s 2. opt -S -polly-canonicalize matmul.s > matmul.preopt.ll 3. opt -basicaa -polly-dependences -analyze matmul.preopt.ll But it doesn't show me the dependences. I
2013 Aug 16
2
[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops
At 2013-08-16 12:44:02,"Tobias Grosser" <tobias at grosser.es> wrote: >Hi, > >I tried to reproduce your findings, but could not do so. Sorry, I did not put all code in my previous email because the code seems a little too long and complicated. You can refer to the detailed C code and LLVM IR code on http://llvm.org/bugs/show_bug.cgi?id=16843 There are four attachments
2011 Nov 01
0
[LLVMdev] How to make Polly ignore some non-affine memory accesses
Mmm, this code seems to kill polly: #include <stdio.h> #include <stdlib.h> int main() { char *B; int i,j,k,h; const int x = 0, y=0; B = (char *)malloc(sizeof(char)*1024*1024); for (i = 1; i < 1024; i++) for (j = 1; j < 1024; j++) { if (i+j > 1000) B[j] = i; } printf("Random Value: %d", B[rand() % 1024*1024]); return 0; } running: opt
2018 May 15
2
Four bitcode generated with plugin-opt=save-temps
Hi, I use LDFLAGS=" -flto -fuse-ld=gold -Wl,-plugin-opt=save-temps " to generate the makefile and build the whole program. However, I found four different kinds of bitcode for each target. For example, I am compiling coreutils. For the program "nohup", I get nohup.0.0.preopt.bc, nohup.0.2.internalize.bc, nohup.0.4.opt.bc, and nohup.0.5.precodegen.bc. If I am right, I
2013 Aug 16
0
[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops
On 08/16/2013 02:42 AM, Star Tan wrote: > At 2013-08-16 12:44:02,"Tobias Grosser" <tobias at grosser.es> wrote: >> Hi, >> >> I tried to reproduce your findings, but could not do so. > > > Sorry, I did not put all code in my previous email because the code seems a little too long and complicated. > You can refer to the detailed C code and LLVM IR
2013 Aug 16
0
[LLVMdev] [Polly] Analysis of extra compile-time overhead for simple nested loops
On 08/15/2013 03:32 AM, Star Tan wrote: > Hi all, Hi, I tried to reproduce your findings, but could not do so. > I have investigated the 6X extra compile-time overhead when Polly compiles the simple nestedloop benchmark in LLVM-testsuite. (http://188.40.87.11:8000/db_default/v4/nts/31?compare_to=28&baseline=28). Preliminary results show that such compile-time overhead is resulted by
2011 Oct 07
1
[LLVMdev] How to make Polly ignore some non-affine memory accesses
I add also the output of these commands: [hades at artemis examples]$ ./compile_ex.sh super_simple_loop Printing analysis 'Polly - Detect Scops in functions' for function 'main': [hades at artemis examples]$ modifying it in : #include <stdio.h> int main() { int A[1024]; int j, k=10; for (j = 0; j < 1024; j++) A[j] = k;
2011 Oct 08
0
[LLVMdev] How to make Polly ignore some non-affine memory accesses
On 10/07/2011 03:43 PM, Marcello Maggioni wrote: > 2011/10/7 Marcello Maggioni<hayarms at gmail.com>: >> Hi, >> >> for example this loop: >> >> #include<stdio.h> >> >> int main() >> { >> int A[1024]; >> int j, k=10; >> for (j = 1; j< 1024; j++) >> A[j] =
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes --enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2013 Sep 30
0
[LLVMdev] [Polly] Move Polly's execution later
At 2013-09-25 18:03:18,"Tobias Grosser" <tobias at grosser.es> wrote:> >I think this is too early, as most of the canonicalization is not yet  >done. We probably don't need to investigate this bug immediately, but >it would be nice if we could make it reproducible without your changes  >to polly. For this please run the command with -debug-pass=Arguments
2018 Jan 29
2
Polly Dependency Analysis in MyPass
I put the following line in CMakeLists.txt: add_subdirectory(mypass). Then I ran make -j9, and then ran the following on the canonicalized IR: $ opt -load lib/LLVMmypass.so -mypass vec-sum.preopt.ll On Mon, Jan 29, 2018 at 9:39 PM, Michael Kruse <llvmdev at meinersbur.de> wrote: > 2018-01-29 10:18 GMT-06:00 hameeza ahmed <hahmed2305 at gmail.com>: > > I tried writing
2019 Jan 18
2
Is it possible to generate the IR representation with the original macro information?
Hi, I use the following commands to compile the IR. But I don't see the macro information in the .ll file. Is there a way to preserve the macro information (print() in this case) for debugging purposes? $ clang -std=gnu99 -g3 -flto -Wall -pedantic -c -o main.o main.c $ clang main.o -flto -fuse-ld=gold '-Wl,-plugin-opt=save-temps' -o main.exe $ llvm-dis main.exe.0.0.preopt.bc /* vim: