Scott Pillow
2013-Feb-07 02:26 UTC
[LLVMdev] alloca scalarization with dynamic indexing into vectors
Hi all,

I have a question regarding dynamic indexing into a vector with GEP. I see that in the ScalarReplAggregates pass in the LLVM 3.2 release, SROA::isSafeGEP() now allows alloca scalarization when a GEP index into a vector isn't a constant. My question is: what is the expected behavior when the index is out of bounds of the vector? Is it undefined? I have an example .ll where an alloca can potentially be scalarized, and the index into the vector is a function argument that could be set to any value.

(scalar_repl_store_delete.ll):

target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024--a64:64:64-f80:128:128-n8:16:32:64"

define void @test_fn(<2 x i32>* %src, <2 x i32>* %results, i32 %alignmentOffsets) nounwind alwaysinline {
entry:
  %sPrivateStorage = alloca [3 x <2 x i32>], align 8
  %0 = load <2 x i32>* %src, align 8, !tbaa !9
  %arrayidx1 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0
  store <2 x i32> %0, <2 x i32>* %arrayidx1, align 8, !tbaa !9
  %arrayidx2 = getelementptr inbounds <2 x i32>* %src, i64 1
  %1 = load <2 x i32>* %arrayidx2, align 8, !tbaa !9
  %arrayidx3 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 1
  store <2 x i32> %1, <2 x i32>* %arrayidx3, align 8, !tbaa !9
  %arrayidx4 = getelementptr inbounds <2 x i32>* %src, i64 2
  %2 = load <2 x i32>* %arrayidx4, align 8, !tbaa !9
  %arrayidx5 = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 2
  store <2 x i32> %2, <2 x i32>* %arrayidx5, align 8, !tbaa !9
  %idx.ext = zext i32 %alignmentOffsets to i64
  %add.ptr = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0, i64 %idx.ext
  %3 = load i32* %add.ptr, align 4, !tbaa !11
  %4 = insertelement <2 x i32> undef, i32 %3, i32 0
  %splat = shufflevector <2 x i32> %4, <2 x i32> undef, <2 x i32> zeroinitializer
  store <2 x i32> %splat, <2 x i32>* %results, align 8, !tbaa !9
  ret void
}

!9 = metadata !{metadata !"omnipotent char", metadata !10}
!10 = metadata !{metadata !"Simple C/C++ TBAA", null}
!11 = metadata !{metadata !"int", metadata !9}

In this example, the sequence of stores copies the data from %src into %sPrivateStorage, and the GEP of interest is:

  %add.ptr = getelementptr inbounds [3 x <2 x i32>]* %sPrivateStorage, i64 0, i64 0, i64 %idx.ext

After running:

  opt.exe -scalarrepl scalar_repl_store_delete.ll -o=scalar_repl_store_delete_after.bc

we get:

(scalar_repl_store_delete_after.ll):

; ModuleID = 'scalar_repl_store_delete_after.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v16:16:16-v24:32:32-v32:32:32-v48:64:64-v64:64:64-v96:128:128-v128:128:128-v192:256:256-v256:256:256-v512:512:512-v1024:1024:1024--a64:64:64-f80:128:128-n8:16:32:64"

define void @test_fn(<2 x i32>* %src, <2 x i32>* %results, i32 %alignmentOffsets) nounwind alwaysinline {
entry:
  %sPrivateStorage.0 = alloca <2 x i32>, align 8
  %0 = load <2 x i32>* %src, align 8, !tbaa !0
  store <2 x i32> %0, <2 x i32>* %sPrivateStorage.0, align 8, !tbaa !0
  %arrayidx2 = getelementptr inbounds <2 x i32>* %src, i64 1
  %1 = load <2 x i32>* %arrayidx2, align 8, !tbaa !0
  %arrayidx4 = getelementptr inbounds <2 x i32>* %src, i64 2
  %2 = load <2 x i32>* %arrayidx4, align 8, !tbaa !0
  %idx.ext = zext i32 %alignmentOffsets to i64
  %add.ptr = getelementptr inbounds <2 x i32>* %sPrivateStorage.0, i32 0, i64 %idx.ext
  %3 = load i32* %add.ptr, align 4, !tbaa !2
  %4 = insertelement <2 x i32> undef, i32 %3, i32 0
  %splat = shufflevector <2 x i32> %4, <2 x i32> undef, <2 x i32> zeroinitializer
  store <2 x i32> %splat, <2 x i32>* %results, align 8, !tbaa !0
  ret void
}

!0 = metadata !{metadata !"omnipotent char", metadata !1}
!1 = metadata !{metadata !"Simple C/C++ TBAA", null}
!2 = metadata !{metadata !"int", metadata !0}

The second and third stores are deleted because they appear to be dead, even though that data can actually be reached through the out-of-bounds vector index in the GEP. What is expected in this case?

Thanks,
Scott
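To make the address arithmetic behind the GEP of interest concrete, here is a minimal sketch that is not part of the original post. It uses the same LLVM 3.2 syntax and computes the same address with explicit integer arithmetic, leaving the inbounds flag aside. The function name @flat_lane_addr and its %storage parameter are hypothetical, and the sketch assumes <2 x i32> occupies 8 bytes under the datalayout quoted in the message.

```llvm
; Hypothetical helper for illustration only; not part of the original test case.
; Assumes <2 x i32> occupies 8 bytes, so [3 x <2 x i32>] occupies 24 bytes
; and each i32 lane is 4 bytes wide.
define i32* @flat_lane_addr([3 x <2 x i32>]* %storage, i32 %alignmentOffsets) nounwind {
entry:
  %idx.ext = zext i32 %alignmentOffsets to i64
  ; GEP indices (i64 0, i64 0, i64 %idx.ext) give the byte offset
  ;   0 * 24 + 0 * 8 + %idx.ext * 4
  %byte.off = mul i64 %idx.ext, 4
  %base = ptrtoint [3 x <2 x i32>]* %storage to i64
  %addr = add i64 %base, %byte.off
  ;   %idx.ext = 0, 1  ->  bytes 0, 4    (array element 0)
  ;   %idx.ext = 2, 3  ->  bytes 8, 12   (array element 1)
  ;   %idx.ext = 4, 5  ->  bytes 16, 20  (array element 2)
  ;   %idx.ext >= 6    ->  past the end of the 24-byte object
  %lane = inttoptr i64 %addr to i32*
  ret i32* %lane
}
```

Under these assumptions, any index from 2 through 5 addresses memory written by the second or third store in the original function, while the scalarized version keeps only an 8-byte alloca, so the same indices then point outside it.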
Duncan Sands
2013-Feb-07 10:39 UTC
[LLVMdev] alloca scalarization with dynamic indexing into vectors
Hi Scott,

this seems like a SROA bug to me, please open a bug report.

Ciao, Duncan.