search for: expandload

Displaying 9 results from an estimated 9 matches for "expandload".

2016 Sep 19
2
RFC: New intrinsics masked.expandload and masked.compressstore
...= PassThruV[i]; On this poster ( http://llvm.org/devmtg/2013-11/slides/Demikhovsky-Poster.pdf ) you'll find depicted "compress" and "expand" patterns. The RFC proposes to support this functionality by introducing two intrinsics to LLVM IR: llvm.masked.expandload.* llvm.masked.compressstore.* The syntax of these two intrinsics is similar to the syntax of llvm.masked.load.* and masked.store.*, respectively, but the semantics are different, matching the above patterns. %res = call <16 x float> @llvm.masked.expandload.v16f32.p0f32...
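As a rough illustration of the "expand" pattern this RFC describes, the Python sketch below models an expanding load: consecutive memory elements are placed into the lanes whose mask bit is set, and the remaining lanes take the pass-through value. The function name `expand_load` and its list-based signature are hypothetical, chosen only for illustration; they are not the intrinsic's actual interface.

```python
def expand_load(memory, mask, passthru):
    """Illustrative model of an expanding load (hypothetical helper, not LLVM's API).

    memory   -- consecutive elements starting at the load address
    mask     -- one boolean per result lane
    passthru -- values used for lanes whose mask bit is clear
    """
    if len(mask) != len(passthru):
        raise ValueError("mask and passthru must have the same number of lanes")

    result = []
    next_index = 0  # position of the next unread element in memory
    for enabled, fallback in zip(mask, passthru):
        if enabled:
            # Enabled lanes consume memory elements in order.
            result.append(memory[next_index])
            next_index += 1
        else:
            # Disabled lanes read nothing and keep the pass-through value.
            result.append(fallback)
    return result


loaded = expand_load([1.0, 2.0, 3.0], [True, False, True, False], [0.0] * 4)
```

In this model only as many memory elements are read as there are set mask bits, so the third element of the example input is never touched.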
2016 Sep 25
5
RFC: New intrinsics masked.expandload and masked.compressstore
...in the future) TTI->checkAdditionalVectorizationOppotunities() - detects target specific patterns; X86 will find compress/expand and maybe others TTI->vectorizeMemoryInstruction() - handle only exotic target-specific cases Pros: It will allow us to implement all X86 specific solutions. The expandload and compressstore intrinsics may be x86 specific, polymorphic: llvm.x86.masked.expandload() llvm.x86.masked.compressstore() Cons: TTI will need to deal with Loop Info, SCEVs and other loop analysis info that it does not have today. (I do not like this way) Or we'll need to introduce TLV - Tar...
2016 Sep 26
2
RFC: New intrinsics masked.expandload and masked.compressstore
...ation. But again, I'm not sure that constructing a plug-in will not be overkill in this case. | |> TTI->vectorizeMemoryInstruction() - handle only exotic |> target-specific cases |> |> Pros: |> It will allow us to implement all X86 specific solutions. |> The expandload and compressstore intrinsics may be x86 specific, |> polymorphic: |> llvm.x86.masked.expandload() |> llvm.x86.masked.compressstore() |> |> Cons: |> |> TTI will need to deal with Loop Info, SCEVs and other loop analysis |> info that it does not have today. (I...
2016 Jun 25
2
Question about VectorLegalizer::ExpandStore() with v4i1
...t0, t142, t32, undef:i64 t142: i32 = extract_vector_elt t128, Constant:i64<3> As you can see in the SelectionDAG above, if the backend decides to expand a vector store with v4i1, the vector legalizer generates 4 stores with the same destination address. I think it needs to handle non-byte-addressable types like ExpandLoad() does. When I look at ExpandLoad(), it handles this case. If I were implementing a new backend, I might have used custom lowering to avoid this case. But I am using the x86_64 target and it generates the above code. What do you think about it? If I missed something, please let me know. Thanks, JinGu Kang
2016 Jun 28
0
Question about VectorLegalizer::ExpandStore() with v4i1
...t142: i32 = extract_vector_elt t128, Constant:i64<3> > > As you can see in the SelectionDAG above, if the backend decides to expand a vector > store with v4i1, the vector legalizer generates 4 stores with the same > destination address. I think it needs to handle non-byte-addressable > types like ExpandLoad() does. When I look at ExpandLoad(), it handles this > case. If I were implementing a new backend, I might have used custom lowering to > avoid this case. But I am using the x86_64 target and it generates the above > code. What do you think about it? If I missed something, please let me > know. > > Thank...
2016 Jun 28
2
Question about VectorLegalizer::ExpandStore() with v4i1
...ct_vector_elt t128, Constant:i64<3> >> >> As you can see in the SelectionDAG above, if the backend decides to expand a vector >> store with v4i1, the vector legalizer generates 4 stores with the same >> destination address. I think it needs to handle non-byte-addressable >> types like ExpandLoad() does. When I look at ExpandLoad(), it handles this >> case. If I were implementing a new backend, I might have used custom lowering to >> avoid this case. But I am using the x86_64 target and it generates the above >> code. What do you think about it? If I missed something, please let me >> know....
2019 Aug 29
2
Complex proposal v2
...> %mask) declare void @llvm.masked.scatter.v8c64.p0v8c64(<8 x c64> %val, <8 x c64*> %ptrs, i32 %alignment, <8 x i1> %mask) llvm.masked.expandload.* - Overloaded intrinsic to expandload complex under mask (not all variants shown) declare v4c32 @llvm.masked.expandload.v4c32.p0v4c32(c32* %ptr, <4 x i1> %mask,...
2019 Oct 22
4
Complex proposal v3 + roundtable agenda
...> %mask) declare void @llvm.masked.scatter.v8c64.p0v8c64(<8 x c64> %val, <8 x c64*> %ptrs, i32 %alignment, <8 x i1> %mask) llvm.masked.expandload.* - Overloaded intrinsic to expandload complex under mask (not all variants shown) declare v4c32 @llvm.masked.expandload.v4c32.p0v4c32(c32* %ptr, <4 x i1> %mask,...
2020 Nov 12
0
Complex proposal v3 + roundtable agenda
...lvm.masked.scatter.v8c64.p0v8c64(<8 x c64> %val, > <8 x c64*> %ptrs, > i32 %alignment, > <8 x i1> %mask) > > llvm.masked.expandload.* - Overloaded intrinsic to expandload complex under mask > (not all variants shown) > > declare v4c32 @llvm.masked.expandload.v4c32.p0v4c32(c32* %ptr, > <4 x i1> %mask, >...