similar to: [LLVMdev] LoopVectorizer in OpenCL C work group autovectorization

Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization"

2013 Jan 25
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Hi Pekka, > Hi, > > I started to play with the LoopVectorizer of LLVM trunk > on the work-item loops produced by pocl's OpenCL C > kernel compiler, in hopes of implementing multi-work-item > work group autovectorization in a modular manner. > Thanks for checking the Loop Vectorizer, I am interested in hearing your feedback. The Loop Vectorizer does not fit here.
2013 Jan 25
4
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
On 01/25/2013 09:56 AM, Nadav Rotem wrote: > Thanks for checking the Loop Vectorizer, I am interested in hearing your > feedback. The Loop Vectorizer does not fit here. OpenCL vectorization is > completely different because the language itself is data-parallel. You > don't need all of the legality checks that the loop vectorizer has. I'm aware of this and it was my point in
2013 Jan 25
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Hi Pekka, > How I see it, the data parallel input simply makes the vectorizer's job > easier (skip some of the legality checks) while reusing most of the > implementation (e.g. cost estimation, unrolling decisions, the > vector instruction formation itself, predication/if-conversion, > speculative execution+blend, etc.). > What you need is outer loop vectorization while
2013 Jan 25
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
----- Original Message ----- > From: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi> > To: "Nadav Rotem" <nrotem at apple.com> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Friday, January 25, 2013 5:35:16 AM > Subject: Re: [LLVMdev] LoopVectorizer in OpenCL C work group autovectorization > > On
2013 Jan 31
3
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Hi Ralf, On 01/31/2013 05:44 PM, Ralf Karrenberg wrote: > As for the current status, the loop vectorizer is only able to vectorize > inner loops and (I think) does not handle function calls and memory > operations well. This will prevent it from vectorizing a large group of > OpenCL kernels, and certainly all "interesting", more complex ones. Agreed -- but not being able to
2013 Jan 31
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Hi Pekka, hi Nadav, I didn't find the time to read this thread until now, sorry for that. I actually think you are both right :). As for the current status, the loop vectorizer is only able to vectorize inner loops and (I think) does not handle function calls and memory operations well. This will prevent it from vectorizing a large group of OpenCL kernels, and certainly all
2013 Jan 25
2
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
> I am in favor of adding metadata to control different aspects of > vectorization, mainly for supporting user-level pargmas [1] but also for > DSLs. Before we start adding metadata to the IR we need to define the > semantics of the tags. "Parallel_for" is too general. We also want to control > vectorization factor, unroll factor, cost model, etc. These are used to
2013 Jan 25
4
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
On 01/25/2013 04:21 PM, Hal Finkel wrote: > My point is that I specifically think that you should try it. I'm curious > to see how what you come up with might apply to other use cases as well. OK, attached is the first quick attempt towards this. I'm not proposing committing this, but would like to get comments to possibly move towards something committable. It simply looks for a
2013 Jan 31
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
----- Original Message ----- > From: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi> > To: "Ralf Karrenberg" <Chareos at gmx.de> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, January 31, 2013 11:15:43 AM > Subject: Re: [LLVMdev] LoopVectorizer in OpenCL C work group autovectorization > > Hi
2013 Jan 25
2
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
On 01/25/2013 04:00 PM, Hal Finkel wrote: > Based on this experience, can you propose some metadata that would allow > this to happen (so that the LoopVectorizer would be generally useful for > POCL)? I suspect this same metadata might be useful in other contexts (such > as implementing iteration-independence pragmas). I cannot yet. In this hack I simply changed LoopVectorizer to
2013 Jan 25
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
----- Original Message ----- > From: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Nadav Rotem" <nrotem at apple.com> > Sent: Friday, January 25, 2013 8:14:57 AM > Subject: Re: [LLVMdev] LoopVectorizer in OpenCL
2013 Feb 01
1
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Hi Hal, On 1/31/13 6:47 PM, Hal Finkel wrote: >>> In any case, since our own OpenCL driver is more of a >>> proof-of-concept >>> implementation and not very robust, I'd be willing to give it a try >>> to >>> integrate the current libWFV into pocl. This should boost >>> performance >>> quite a bit for many kernels without too much
2013 Jan 25
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Pekka, I am in favor of adding metadata to control different aspects of vectorization, mainly for supporting user-level pargmas [1] but also for DSLs. Before we start adding metadata to the IR we need to define the semantics of the tags. "Parallel_for" is too general. We also want to control vectorization factor, unroll factor, cost model, etc. Doug Gregor suggested to add the
2013 Jan 28
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
Pekka Jääskeläinen wrote: > On 01/25/2013 04:21 PM, Hal Finkel wrote: >> My point is that I specifically think that you should try it. I'm curious >> to see how what you come up with might apply to other use cases as well. > > OK, attached is the first quick attempt towards this. I'm not > proposing committing this, but would like to get comments > to possibly
2015 Jul 06
4
[LLVMdev] SPMD Autovectorizer
Hi, Are there any plans to integrate an autovectorizer for SPMD programs into LLVM? For example, there were previous discussions about integrating the whole function vectorizer (WFV) from Ralf Karrenberg into LLVM. Thanks, Zack -------------- next part -------------- An HTML attachment was scrubbed... URL:
2013 Feb 01
0
[LLVMdev] LoopVectorizer in OpenCL C work group autovectorization
----- Original Message ----- > From: "Ralf Karrenberg" <Chareos at gmx.de> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>, "Nadav > Rotem" <nrotem at apple.com> > Sent: Friday, February 1, 2013
2013 Oct 25
2
[LLVMdev] Is there pass to break down <4 x float> to scalars
Hi, Great to see someone working on this. This will benefit the performance portability goal of the pocl's OpenCL kernel compiler. It has been one of the low hanging fruits in improving its implicit WG vectorization applicability. The use case there is that sometimes it makes sense to devectorize the explicitly used vector datatype code of OpenCL kernels in order to make better opportunities
2013 Feb 19
4
[LLVMdev] Pointer Context Metadata (was: Parallel Loop Metadata)
----- Original Message ----- > From: "Pekka Jääskeläinen" <pekka.jaaskelainen at tut.fi> > To: "Nadav Rotem" <nrotem at apple.com> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "Tobias Grosser" <tobias at grosser.es>, "llvmdev at cs.uiuc.edu Dev" > <llvmdev at cs.uiuc.edu> > Sent: Tuesday, February 19, 2013
2011 Oct 19
6
[LLVMdev] ANN: libclc (OpenCL C library implementation)
Hi everybody, the compiler design lab at Saarland University (chair of Sebastian Hack) is also working on an LLVM-based OpenCL driver. The project started as a use-case for our "Whole-Function Vectorization" library, which allows to transform a function to compute the same as W executions of the original code by using SIMD instructions (W = 4 for SSE/AltiVec, 8 for AVX). The
2011 Oct 19
1
[LLVMdev] ANN: libclc (OpenCL C library implementation)
Hi Micah, The numbers from the paper were measured with the ATI Stream SDK v2.1 (it's only mentioned in the references I think). The most recent measurements I have were done with the current v2.5. Best, Ralf Am 19.10.2011 23:43, schrieb Villmow, Micah: > Ralf, > What version of the SDK were you using for your analysis? I don't see that in the slides/pdf. > > Thanks, >