search for: autovectorization

Displaying 20 results from an estimated 238 matches for "autovectorization".

2011 Jun 24
2
[LLVMdev] LLVM autovectorization support
I would like to know the status of the autovectorization support in LLVM. does LLVM have a loop dependence analysis, does LLVM have a infrastructure for autovectorization ? etc. Kind Regards Xin Tong
2011 Jun 24
0
[LLVMdev] LLVM autovectorization support
On 24 June 2011 21:13, Xin Tong Utoronto <x.tong at utoronto.ca> wrote: > I would like to know the status of the autovectorization support in LLVM. >  does LLVM have a loop dependence analysis, does LLVM have a infrastructure > for autovectorization ? etc. Not yet, but it's getting there... http://polly.grosser.es/ cheers, --renato
2015 Jul 06
4
[LLVMdev] SPMD Autovectorizer
Hi, Are there any plans to integrate an autovectorizer for SPMD programs into LLVM? For example, there were previous discussions about integrating the whole function vectorizer (WFV) from Ralf Karrenberg into LLVM. Thanks, Zack
2015 Jul 07
2
[LLVMdev] SPMD Autovectorizer
On 07/07/2015 01:32 PM, Renato Golin wrote: > Wouldn't OpenMP account for some of that? At least on a single > machine, could you have both parallel and simd optimisations done on > the same loop? The point in SPMD program description (e.g. CUDA or OpenCL C) autovectorization is to produce something like OpenMP parallel loops or SIMD pragmas automatically from the single thread/WI description, adhering to its barrier synchronization semantics etc. That is, the output of this pass could be also converted to OpenMP SIMD constructs, if wanted. In pocl's case the outpu...
2013 Jan 24
3
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Hi, I started to play with the LoopVectorizer of LLVM trunk on the work-item loops produced by pocl's OpenCL C kernel compiler, in hopes of implementing multi-work-item work group autovectorization in a modular manner. The vectorizer seems to refuse to vectorize the loop if it sees multiple writes to the same memory object within the same iteration. In case of parallel loops such as the work-item loops, it could just assume vectorization is doable from the data dependency point of view -- no...
2013 Feb 01
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
...Jääskeläinen" <pekka.jaaskelainen at tut.fi>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Nadav > Rotem" <nrotem at apple.com> > Sent: Friday, February 1, 2013 1:49:28 AM > Subject: Re: [LLVMdev] LoopVectorizer in OpenCL C work group autovectorization > > Hi Hal, > > On 1/31/13 6:47 PM, Hal Finkel wrote: > >>> In any case, since our own OpenCL driver is more of a > >>> proof-of-concept > >>> implementation and not very robust, I'd be willing to give it a > >>> try > >>>...
2013 Feb 01
1
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Hi Hal, On 1/31/13 6:47 PM, Hal Finkel wrote: >>> In any case, since our own OpenCL driver is more of a >>> proof-of-concept >>> implementation and not very robust, I'd be willing to give it a try >>> to >>> integrate the current libWFV into pocl. This should boost >>> performance >>> quite a bit for many kernels without too much
2013 Jan 25
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Hi Pekka, > Hi, > > I started to play with the LoopVectorizer of LLVM trunk > on the work-item loops produced by pocl's OpenCL C > kernel compiler, in hopes of implementing multi-work-item > work group autovectorization in a modular manner. > Thanks for checking the Loop Vectorizer, I am interested in hearing your feedback. The Loop Vectorizer does not fit here. OpenCL vectorization is completely different because the language itself is data-parallel. You don't need all of the legality checks that the loo...
2015 Jul 07
2
[LLVMdev] SPMD Autovectorizer
On 07/07/2015 09:30 PM, C Bergström wrote: > If you're going to "autopar" (turn a loop into a threads which run on > many cores or something) then please don't add a dependency on OMP. I wouldn't, but simply utilize the parallel loop metadata that was originally designed for this purpose. What is done with that MD is up to other passes. -- --Pekka
2013 Jan 25
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
...o: "Hal Finkel" <hfinkel at anl.gov> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Nadav Rotem" <nrotem at apple.com> > Sent: Friday, January 25, 2013 8:14:57 AM > Subject: Re: [LLVMdev] LoopVectorizer in OpenCL C work group autovectorization > > On 01/25/2013 04:00 PM, Hal Finkel wrote: > > Based on this experience, can you propose some metadata that would > > allow > > this to happen (so that the LoopVectorizer would be generally > > useful for > > POCL)? I suspect this same metadata might be usefu...
2009 Apr 01
2
[LLVMdev] GSoC 2009: Auto-vectorization
...to be >> supported. Therefore my aim is to start with the most minimal >> implementation possible, to explore the difficulties encountered in >> the specific context of LLVM and to build a foundation from which future >> work can progress. > > There's two types of autovectorization, SLP (superword level > parallelism) and ILP (instruction level parallelism). You can do ILP > which is loop-ignorant autovectorization of straight-line code which > turns out to be the code that runs inside the loop. Pardon me. I've been disabused by Owen Anderson for writing the...
2013 Jan 31
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
...<pekka.jaaskelainen at tut.fi> > To: "Ralf Karrenberg" <Chareos at gmx.de> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, January 31, 2013 11:15:43 AM > Subject: Re: [LLVMdev] LoopVectorizer in OpenCL C work group autovectorization > > Hi Ralf, > > On 01/31/2013 05:44 PM, Ralf Karrenberg wrote: > > As for the current status, the loop vectorizer is only able to > > vectorize > > inner loops and (I think) does not handle function calls and memory > > operations well. This will prevent it fr...
2011 Nov 28
1
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
...>>> >>> Thanks again, >>> Hal >>> >>> On Thu, 2011-11-17 at 13:57 +0100, Tobias Grosser wrote: >>>> On 11/17/2011 12:38 AM, Hal Finkel wrote: >>>>> Tobias, et al., >>>>> >>>>> Attached is the my autovectorization pass. >>>> >>>> Very nice. Will you be at the developer summit? Maybe we could discuss >>>> the integration there? >>>> >>>> Here a first review of the source code. >>>> >> >> > > -- > Hal Finkel >...
2011 Oct 29
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Sat, Oct 29, 2011 at 12:30:12PM -0500, Hal Finkel wrote: > Also, when using clang, I had to pass -Dinline= on the command line: > when using -emit-llvm, clang appears not to emit code for functions > declared inline. This is a bug, but I've not yet tracked it down. http://clang.llvm.org/compatibility.html#inline Thanks, -- Peter
2011 Nov 17
2
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
Hello Hal, > MultiSource/Applications/ClamAV - fails to compile shared_sha256.c with > an error: error in backend: Cannot select: 0x4fbcb40: v2i64 = > X86ISD::MOVLPD 0x4149e00, 0x418d930 [ID=596] Please report this as a PR regardless of the pass. Bugs in the backend should be fixed. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State
2011 Dec 20
2
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
Hi, I see that there are two functions in your code that are O(n^2) in number of instructions of the program: getCandidatePairs and buildDepMap. I think that you could make these two functions faster if you work on some form of factored def-use chains for memory, like the VUSE/VDEFs of GCC. I was trying to find a similar representation in LLVM: isn't there already a virtual SSA
2011 Dec 20
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Tue, 2011-12-20 at 13:57 -0600, Sebastian Pop wrote: > Hi, > > I see that there are two functions in your code that are O(n^2) in > number of instructions of the program: getCandidatePairs and > buildDepMap. I think that you could make these two functions faster > if you work on some form of factored def-use chains for memory, like > the VUSE/VDEFs of GCC. Thanks for the
2012 Jan 26
0
[LLVMdev] [llvm-commits] [PATCH] BasicBlock Autovectorization Pass
On Thu, Jan 26, 2012 at 2:49 PM, Hal Finkel <hfinkel at anl.gov> wrote: > Thanks! Did you compile with any non-default flags other than -mllvm > -vectorize? I used -O3 and -vectorize, no other non-default flags. Sebastian -- Qualcomm Innovation Center, Inc is a member of Code Aurora Forum
2015 Jul 07
2
[LLVMdev] SPMD Autovectorizer
...an ongoing experimental HSA support work). Adding a mode where some of the parallel loop iterations are executed in SIMD lanes and some in multiple cores with the target's supported threading mechanism is something to consider, but not yet done (in pocl). The original question was only about autovectorization so I'd not go there yet. OpenMP was just a side note from me, sorry for the possible confusion. > I'd be interested in knowing what kind of changes we'd need to get the > OMP+SIMD model into CL-type code, if that's what you're proposing... I'm not sure what you mean...
2012 Feb 03
0
[LLVMdev] Vectorization: Next Steps
Hi Hal, > As some of you may know, I committed my basic-block autovectorization > pass a few days ago. I encourage anyone interested to try it out (pass > -vectorize to opt or -mllvm -vectorize to clang) and provide feedback. > Especially in combination with -unroll-allow-partial, I have observed > some significant benchmark speedups, but, I have also observed some...