Hi,
How is access_fn declared in the IR?
Some attributes are needed in order for LICM to be able to operate (see
llvm/test/Transforms/LICM/argmemonly-call.ll )
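
As a rough illustration of the kind of attribute meant here: if the accessor were compiled from C++ by Clang or GCC rather than emitted directly as IR, the "pure" function attribute is one way to promise that a call only reads memory and has no other side effects, and Clang lowers it to a read-only attribute on the IR function. The Terminal struct, the access_fn signature and sum_products below are hypothetical stand-ins, not the DSL's real definitions. The test referenced above also relies on argmemonly, which this sketch has no source-level equivalent for; a read-only call can generally be hoisted only if nothing in the loop may write the memory it reads, so "pure" alone may not be enough when the loop also stores to "lat".

```cpp
#include <cstddef>

// Hypothetical stand-in for the DSL's terminal node; the real layout of
// struct.mdarray_terminal is not shown in this thread.
struct Terminal {
    int data[2][2];
};

// `pure` (a GCC/Clang extension) promises that the call has no side effects
// and only reads memory, so the optimizer may merge or move repeated calls
// when it can show the memory they read is unchanged.
__attribute__((pure)) const int *access_fn(const Terminal *t,
                                           std::size_t row, std::size_t col);

const int *access_fn(const Terminal *t, std::size_t row, std::size_t col) {
    return &t->data[row][col];
}

// Sums a[i](0,0) * b[j](0,0) over a 4x4 iteration space. The access to a[i]
// does not depend on j, which is what makes it a hoisting candidate.
int sum_products(const Terminal (&a)[4], const Terminal (&b)[4]) {
    int total = 0;
    for (std::size_t i = 0; i < 4; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            total += *access_fn(&a[i], 0, 0) * *access_fn(&b[j], 0, 0);
    return total;
}
```

When the IR is emitted directly instead, the corresponding attributes would go on the @access_fn declaration itself.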
—
Mehdi
On Dec 26, 2016, at 9:12 PM, Diptorup Deb via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> Hello,
>
> I am working on a C++ expression templates based DSL where we are using
> LLVM for the code generation. I needed some help in understanding the
> behaviour of the LICM pass. In the following example code the "A" class
> is a custom container that defines various arithmetic operators using
> expression templates. We are defining three arrays of the "A" container
> and aggregating the result of the multiplication into "lat".
>
> I was attempting to get the expressions "a[i]" and "b[j]" to be hoisted
> on top of the "j-loop" and the "k-loop" respectively.
>
> //=== C++ code snippet ===//
>
> A<int> a[4] = {A<int>(&ctx), A<int>(&ctx), A<int>(&ctx), A<int>(&ctx)};
> A<int> b[4] = {A<int>(&ctx), A<int>(&ctx), A<int>(&ctx), A<int>(&ctx)};
> A<int> c[4] = {A<int>(&ctx), A<int>(&ctx), A<int>(&ctx), A<int>(&ctx)};
> A<int> lat(&ctx);
>
> for (std::size_t i = 0; i < 4; ++i)
>   for (std::size_t j = 0; j < 4; ++j)
>     for (std::size_t k = 0; k < 4; ++k) {
>       lat = a[i] * b[j] * c[k];
>     }
>
> The IR generated for the body of the innermost loop after inlining
> most of the expression template calls and loop simplification is shown
> below.
>
> If I run LICM on this IR, the GEPs on lines 1 and 2 are hoisted into
> the preheaders of the "j-loop" and the "k-loop" respectively. I believe
> this is because the operands to the GEPs are loop invariant and
> *isSafeToExecuteUnconditionally* returns trivially true for a GEP.
>
> However, the CallInsts on lines 4 and 6 remain inside the innermost loop,
> because *hasLoopInvariantOperands* returns false for them: their GEP
> operands are not themselves loop invariant.
>
> This is the behaviour I was not sure about, and I would greatly appreciate
> some help in understanding it. Also, how should the code be structured
> for LICM to hoist the CallInsts out?
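
One way to picture the structure being asked for is to do the hoisting by hand at the source level: read the a[i] element in the i-loop and combine it with the b[j] element in the j-loop, so that only the c[k] access remains in the innermost body. The sketch below uses a simplified, hypothetical Terminal type, access_fn and last_product in place of the expression-template machinery, and the transformation is only valid when nothing written inside the loops can alias the hoisted reads.

```cpp
#include <cstddef>

// Hypothetical stand-in for the DSL's terminal node.
struct Terminal {
    int value;
};

// Hypothetical accessor: returns a pointer to the element, reads nothing else.
inline const int *access_fn(const Terminal *t) { return &t->value; }

// Same loop nest as the snippet above, with the invariant accesses moved out
// by hand. As in the original, each iteration overwrites `lat`.
int last_product(const Terminal (&a)[4], const Terminal (&b)[4],
                 const Terminal (&c)[4]) {
    int lat = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        const int ai = *access_fn(&a[i]);            // invariant in j and k
        for (std::size_t j = 0; j < 4; ++j) {
            const int ab = ai * *access_fn(&b[j]);   // invariant in k
            for (std::size_t k = 0; k < 4; ++k) {
                lat = ab * *access_fn(&c[k]);        // only c[k] varies here
            }
        }
    }
    return lat;
}
```

This is the shape LICM would have to prove safe on its own, which is why the memory behaviour of the accessor call matters.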
>
> //=== Generated IR for innermost loop body ===//
>
> 1: %22 = getelementptr inbounds [4 x %"struct.mdarray_terminal"],
>      [4 x %"struct.mdarray_terminal"]* %a, i64 0, i64 %i.0
> 2: %23 = getelementptr inbounds [4 x %"struct.mdarray_terminal"],
>      [4 x %"struct.mdarray_terminal"]* %b, i64 0, i64 %j.0
> 3: %24 = getelementptr inbounds [4 x %"struct.mdarray_terminal"],
>      [4 x %"struct.mdarray_terminal"]* %c, i64 0, i64 %k.0
> 4: %25 = call i32* @access_fn(%"struct.mdarray_terminal"* %22, i64 0, i64 0)
> 5: %26 = load i32, i32* %25, !alias.scope !1, !noalias !3
> 6: %27 = call i32* @access_fn(%"struct.mdarray_terminal"* %23, i64 0, i64 0)
> 7: %28 = load i32, i32* %27, !alias.scope !5, !noalias !7
> 8: %mkernel = call i32 @mult_op(i32 %26, i32 %28)
> 9: %29 = call i32* @access_fn(%"struct.mdarray_terminal"* %24, i64 0, i64 0)
> 10: %30 = load i32, i32* %29, !alias.scope !6, !noalias !8
> 11: %mkernel2 = call i32 @mult_op(i32 %mkernel, i32 %30)
> 12: %31 = call i32* @access_fn(%"struct.mdarray_terminal"* %lat, i64 0, i64 0)
> 13: store i32 %mkernel2, i32* %31, !alias.scope !4, !noalias !9
> 14: %32 = add i64 %k.0, 1
> 15: br label %19
>
> Best,
> Dipto