search for: vstn

Displaying 8 results from an estimated 8 matches for "vstn".

2016 Oct 10
2
[arm, aarch64] Alignment checking in interleaved access pass
Hi Renato, Thank you for the answers! First, let me clarify a couple of things and give some context. The patch is looking at VSTn rather than VLDn (the "right" patterns seem to be somewhat harder to get for stores; the pass is already doing a good job for loads). The examples you gave come mostly from loop vectorization, which, as I understand it, was the reason for adding the interleaved access pass. I'm looking a...
2016 Oct 10
2
[arm, aarch64] Alignment checking in interleaved access pass
On Mon, Oct 10, 2016 at 1:14 PM, Renato Golin <renato.golin at linaro.org> wrote: > On 10 October 2016 at 19:39, Alina Sbirlea <alina.sbirlea at gmail.com> wrote: > > Now, for ARM archs Halide is currently generating explicit VSTn intrinsics, with some of the patterns I described, and I found no reason why Halide shouldn't generate a single shuffle, followed by a generic vector store and rely on the interleaved access pass to generate the right intrinsic. > IIRC, the shuffl...
2016 May 26
2
enabling interleaved access loop vectorization
Is there a compile-time and/or potential runtime cost that makes enableInterleavedAccessVectorization() default to 'false'? I notice that this is set to true for ARM, AArch64, and PPC. In particular, I'm wondering if there's a reason it's not enabled for x86 in relation to PR27881: https://llvm.org/bugs/show_bug.cgi?id=27881
2016 May 26
0
enabling interleaved access loop vectorization
...tice that this is set to true for ARM, AArch64, and PPC. > In particular, I'm wondering if there's a reason it's not enabled for x86 in relation to PR27881: > https://llvm.org/bugs/show_bug.cgi?id=27881 Hi Sanjay, The feature was originally developed for ARM's VLDn/VSTn instructions and then extended to AArch64 and PPC, but not x86/64 yet. I believe Elena was working on that, but needed to get the scatter/gather intrinsics working first. I just copied her in case I'm wrong. :) cheers, --renato
2016 Sep 19
3
[arm, aarch64] Alignment checking in interleaved access pass
Hi, As a follow up to Patch D23646 <https://reviews.llvm.org/D23646>, I'm trying to figure out if there should be an alignment check and what the correct approach is. Some background: For stores, the pass turns: %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11> store <12 x i32> %i.vec, <12 x i32>* %ptr
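The shuffle mask quoted in this message can be read as a factor-3 interleave of three 4-lane sub-vectors taken from the concatenation of %v0 and %v1. The following is only an illustrative sketch of that index arithmetic in Python, not the pass's actual implementation; the helper names `interleave_mask` and `apply_shuffle` are hypothetical and do not come from LLVM.

```python
def interleave_mask(factor, lanes):
    """Build an interleaving shuffle mask (hypothetical helper).

    Output position i * factor + j selects lane i of sub-vector j,
    which sits at index j * lanes + i in the concatenated input.
    """
    return [j * lanes + i for i in range(lanes) for j in range(factor)]


def apply_shuffle(v0, v1, mask):
    """Model a two-operand vector shuffle: index into concat(v0, v1)."""
    concat = list(v0) + list(v1)
    return [concat[k] for k in mask]


if __name__ == "__main__":
    # Factor 3, 4 lanes per sub-vector reproduces the mask in the message.
    mask = interleave_mask(3, 4)
    print(mask)

    # Two 8-element inputs; only the first 12 concatenated lanes are used.
    v0 = list(range(100, 108))
    v1 = list(range(200, 208))
    print(apply_shuffle(v0, v1, mask))
```

Under this reading, the 12-element result holds lane 0 of each of the three sub-vectors, then lane 1 of each, and so on, which is the memory layout a factor-3 interleaved store (VST3 on ARM) writes.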
2016 May 26
2
enabling interleaved access loop vectorization
...>> In particular, I'm wondering if there's a reason it's not enabled for x86 in relation to PR27881: >> https://llvm.org/bugs/show_bug.cgi?id=27881 > Hi Sanjay, > The feature was originally developed for ARM's VLDn/VSTn instructions and then extended to AArch64 and PPC, but not x86/64 yet. > I believe Elena was working on that, but needed to get the scatter/gather intrinsics working first. I just copied her in case I'm wrong. :) > cheers, > --renato ...
2016 Aug 05
3
enabling interleaved access loop vectorization
...>> In particular, I'm wondering if there's a reason it's not enabled for x86 in relation to PR27881: >> https://llvm.org/bugs/show_bug.cgi?id=27881 > Hi Sanjay, > The feature was originally developed for ARM's VLDn/VSTn instructions and then extended to AArch64 and PPC, but not x86/64 yet. > I believe Elena was working on that, but needed to get the scatter/gather intrinsics working first. I just copied her in case I'm wrong. :) > cheers, > --renato ...
2016 Aug 05
2
enabling interleaved access loop vectorization
...rticular, I'm wondering if there's a reason it's not enabled for x86 in relation to PR27881: > >> https://llvm.org/bugs/show_bug.cgi?id=27881 > > Hi Sanjay, > > The feature was originally developed for ARM's VLDn/VSTn instructions and then extended to AArch64 and PPC, but not x86/64 yet. > > I believe Elena was working on that, but needed to get the scatter/gather intrinsics working first. I just copied her in case I'm wrong. :) > > cheer...