thr3ads.net - search: "vldn"

[arm, aarch64] Alignment checking in interleaved access pass

2016 Sep 19

3

Hi, as a follow-up to patch D23646 <https://reviews.llvm.org/D23646>, I'm trying to figure out whether there should be an alignment check and what the correct approach is. Some background: for stores, the pass turns: %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11> store <12 x i32> %i.vec, <12 x i32>* %ptr
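The shuffle mask in the snippet above follows a regular pattern: element i of each 4-lane sub-vector is placed next to element i of the others. Below is a minimal Python sketch of that pattern. The helper names (interleave_mask, shufflevector) and the placeholder lane values are illustrative assumptions, not part of LLVM or of the patch under discussion.

```python
def interleave_mask(factor, lanes):
    """Mask that interleaves `factor` sub-vectors of `lanes` elements each."""
    return [sub * lanes + lane for lane in range(lanes) for sub in range(factor)]


def shufflevector(v0, v1, mask):
    """Simplified model of a shuffle: index into the concatenation of v0 and v1."""
    combined = list(v0) + list(v1)
    return [combined[i] for i in mask]


# Factor 3, 4 lanes per sub-vector, as in the <12 x i32> store above.
mask = interleave_mask(factor=3, lanes=4)

# Hypothetical lane labels: a*, b*, c* are the three logical sub-vectors;
# the last four lanes of v1 are never selected by the mask.
v0 = ["a0", "a1", "a2", "a3", "b0", "b1", "b2", "b3"]
v1 = ["c0", "c1", "c2", "c3", "x", "x", "x", "x"]
interleaved = shufflevector(v0, v1, mask)

print(mask)
print(interleaved)
```

A store of a vector built with such a mask is the kind of pattern that can be matched to a single interleaved store (VST3 in this factor-3 case), provided the target's other constraints are met.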

[arm, aarch64] Alignment checking in interleaved access pass

2016 Oct 10

2

Hi Renato, Thank you for the answers! First, let me clarify a couple of things and give some context. The patch is looking at VSTn rather than VLDn (the "right" patterns seem somewhat harder to get for stores; the pass is already doing a good job for loads). The examples you gave come mostly from loop vectorization, which, as I understand it, was the reason for adding the interleaved access pass. I'm looking at a different usec...

enabling interleaved access loop vectorization

2016 May 26

2

Is there a compile-time and/or potential runtime cost that makes enableInterleavedAccessVectorization() default to 'false'? I notice that this is set to true for ARM, AArch64, and PPC. In particular, I'm wondering if there's a reason it's not enabled for x86 in relation to PR27881: https://llvm.org/bugs/show_bug.cgi?id=27881

enabling interleaved access loop vectorization

2016 May 26

0

...I notice that this is set to true for ARM, AArch64, and PPC. > > In particular, I'm wondering if there's a reason it's not enabled for x86 in > relation to PR27881: > https://llvm.org/bugs/show_bug.cgi?id=27881 Hi Sanjay, The feature was originally developed for ARM's VLDn/VSTn instructions and then extended to AArch64 and PPC, but not x86/64 yet. I believe Elena was working on that, but needed to get the scatter/gather intrinsics working first. I just copied her in case I'm wrong. :) cheers, --renato

enabling interleaved access loop vectorization

2016 May 26

2

...PPC. >> >> In particular, I'm wondering if there's a reason it's not enabled for >> x86 in relation to PR27881: >> https://llvm.org/bugs/show_bug.cgi?id=27881 > >Hi Sanjay, > >The feature was originally developed for ARM's VLDn/VSTn instructions >and then extended to AArch64 and PPC, but not x86/64 yet. > >I believe Elena was working on that, but needed to get the scatter/gather >intrinsics working first. I just copied her in case I'm wrong. :) > >cheers, >--renato -----------...

[GSoC 2016] Code Generation Improvements task

2016 Feb 29

2

Hello LLVM Community, I am interested in doing the following project with LLVM for GSoC 2016: Code Generation Improvements, particularly generalizing target-specific backend passes that could be target-independent. I have done some initial study to understand the task to be done. Please help me develop the proposal. The following are my initial findings: 1. lib/Target/Hexagon/RDF* : Code

[GSoC 2016] Code Generation Improvements task

2016 Mar 01

2

...e system. For example on ARM its NEON > similarly other architectures have SIMD support specifically MIPS, IBM > System Z, Power PC with MMX/AltiVec and x86 with Intel’s AVX. Possibly. It seems to rely pretty strongly on ARM's "load more than you can actually use" instructions: vldN instructions can load up to 4 128-bit vectors, but they can still only be used as 128-bit vectors. If other targets possess similar, then they could well benefit; if not, then it's probably pointless. > I have question regarding Target hooks. Does it means using TargetInfo an > SubTarget...
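To illustrate the "load more than you can actually use" remark in the reply above: a vldN-style load reads N * lanes consecutive elements and de-interleaves them into N separate registers, each of which is then used as an ordinary vector. The Python sketch below is a simplified model of that behaviour under the assumption of 4 lanes per register (for example 32-bit elements in a 128-bit register). The function name deinterleave_load is hypothetical, and the model ignores alignment and register-allocation constraints.

```python
def deinterleave_load(memory, factor, lanes):
    """Model of an N-way de-interleaving load.

    Returns `factor` registers of `lanes` elements each; register r holds
    every `factor`-th element of memory, starting at offset r.
    """
    return [
        [memory[lane * factor + reg] for lane in range(lanes)]
        for reg in range(factor)
    ]


# 16 consecutive elements, e.g. four interleaved 4-component records.
memory = list(range(16))
regs = deinterleave_load(memory, factor=4, lanes=4)

for r, contents in enumerate(regs):
    print(f"register {r}: {contents}")
```

Each returned list stands in for one 128-bit register: the load touches four registers' worth of data at once, but later operations still work on one register at a time.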

[GSoC 2016] Code Generation Improvements task

2016 Mar 01

0

...ARM its NEON > > similarly other architectures have SIMD support specifically MIPS, IBM > > System Z, Power PC with MMX/AltiVec and x86 with Intel’s AVX. > > Possibly. It seems to rely pretty strongly on ARM's "load more than > you can actually use" instructions: vldN instructions can load up to 4 > 128-bit vectors, but they can still only be used as 128-bit vectors. > If other targets possess similar, then they could well benefit; if > not, then it's probably pointless. > > > I have question regarding Target hooks. Does it means using Targ...

enabling interleaved access loop vectorization

2016 Aug 05

3

...PPC. >> >> In particular, I'm wondering if there's a reason it's not enabled for >> x86 in relation to PR27881: >> https://llvm.org/bugs/show_bug.cgi?id=27881 > >Hi Sanjay, > >The feature was originally developed for ARM's VLDn/VSTn instructions >and then extended to AArch64 and PPC, but not x86/64 yet. > >I believe Elena was working on that, but needed to get the scatter/gather >intrinsics working first. I just copied her in case I'm wrong. :) > >cheers, >--renato -----------...

enabling interleaved access loop vectorization

2016 Aug 05

2

...In particular, I'm wondering if there's a reason it's not enabled for > >> x86 in relation to PR27881: > >> https://llvm.org/bugs/show_bug.cgi?id=27881 > > > >Hi Sanjay, > > > >The feature was originally developed for ARM's VLDn/VSTn instructions > >and then extended to AArch64 and PPC, but not x86/64 yet. > > > >I believe Elena was working on that, but needed to get the > scatter/gather > >intrinsics working first. I just copied her in case I'm wrong. :) > > > >...
