thr3ads.net - search: "loopvec"

Displaying 6 results from an estimated 6 matches for "loopvec".

Proposal for function vectorization and loop vectorization with function calls

2016 Mar 02

...is a proposal for an initial work towards Clang and LLVM implementation of vectorizing a function annotated with OpenMP 4.5's "#pragma omp declare simd" (named SIMD-enabled function) and its associated clauses based on the VectorABI [2]. On the caller side, we propose to improve LLVM loopVectorizer such that the code that calls the SIMD-enabled function can be vectorized. On the callee side, we propose to add Clang FE support for "#pragma omp declare simd" syntax and a new pass to transform the SIMD-enabled function body into a SIMD loop. This newly created loop can then be f...
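As a rough illustration of the two sides the snippet describes, the C sketch below pairs a function annotated with "#pragma omp declare simd" (the callee) with a loop that calls it once per element (the caller). The names `scale_add` and `apply` are hypothetical and do not come from the thread. The `notinbranch` clause and the `#pragma omp simd` on the loop are illustrative choices, not something the proposal prescribes. Whether vector variants are actually generated and used depends on the compiler and its flags; without OpenMP SIMD support enabled, the pragmas are ignored and the code runs as ordinary scalar C.

```c
#include <stddef.h>

/* Callee side: ask the compiler to also emit SIMD variants of this
 * scalar function. scale_add is a hypothetical example function. */
#pragma omp declare simd notinbranch
float scale_add(float x, float y)
{
    return 2.0f * x + y;
}

/* Caller side: a loop that calls the SIMD-enabled function for each
 * element. A loop vectorizer that understands the annotation could
 * replace the scalar call with a call to a vector variant. */
void apply(const float *a, const float *b, float *out, size_t n)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        out[i] = scale_add(a[i], b[i]);
}
```

Calling `apply` with two input arrays and an output array of the same length fills the output element by element; the result is the same whether or not the loop is vectorized.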

Proposal for function vectorization and loop vectorization with function calls

2016 Mar 02

...itial work towards Clang and LLVM implementation of vectorizing a function annotated with OpenMP 4.5's "#pragma omp declare simd" (named SIMD-enabled function) and its associated clauses based on the VectorABI [2]. On the caller side, we propose to improve LLVM loopVectorizer such that the code that calls the SIMD-enabled function can be vectorized. On the callee side, we propose to add Clang FE support for "#pragma omp declare simd" syntax and a new pass to transform the SIMD-enabled function body into a SIMD loop. This newly created l...

[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation

2013 Apr 17

...s progress lately. FWIW, pocl's OpenCL kernel compiler adds the metadata to work-item loops. That is, if your loop body was an OpenCL kernel with each work-item executing a single iteration, it *might* get "horizontally vectorized" using the loop vectorizer if you use pocl's 'loopvec' work group method and if the memory access pattern is suitable. This is quite fresh code which I'm still optimizing, but I've already managed to autovectorize some work groups using it. BR, -- Pekka

Adding support for vscale

2019 Oct 01

On Tue, Oct 1, 2019 at 11:08 AM Graham Hunter <Graham.Hunter at arm.com> wrote:
> Hi Luke,
hi graham, thanks for responding in such an informative fashion.
> > On 1 Oct 2019, at 09:21, Luke Kenneth Casson Leighton via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > typedef vec4 float[4]; // SEW=32,LMUL=4 probably
> > static vec4 globalvec[1024]; // vscale ==

RFC: Implementing the Swift calling convention in LLVM and Clang

2016 Mar 02

> On Mar 2, 2016, at 1:33 AM, Renato Golin <renato.golin at linaro.org> wrote:
>
> On 2 March 2016 at 01:14, John McCall via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Hi, all.
>> - We sometimes want to return more values in registers than the convention normally does, and we want to be able to use both integer and floating-point registers. For

[LLVMdev] Loop vectorizer behaviour for 2D arrays and parallel annotation

2013 Apr 17

Hello, I am trying to vectorize the following loop but the vectorizer says: "Found a possible write-write reorder" and does not vectorize. Why?

for (j=0; j < 8; j++) {
    jj = j << 3;
    m2[j][0] = diff[jj] + diff[jj+4];
    m2[j][1] = diff[jj+1] + diff[jj+5];
    m2[j][2] = diff[jj+2] + diff[jj+6];
    m2[j][3] = diff[jj+3] + diff[jj+7];
    m2[j][4] = diff[jj] -
