Hi,
How is access_fn declared in the IR?
LICM needs certain attributes on the callee before it can hoist a call (see
llvm/test/Transforms/LICM/argmemonly-call.ll).
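
As an illustration, a declaration along the following lines would give LICM
something to work with. This is only a sketch under assumptions: it presumes
access_fn reads nothing but the memory pointed to by its pointer argument and
never writes or unwinds, which I can't confirm from the snippet. The opaque
struct type is a stand-in for your real definition, and the syntax is the
typed-pointer IR used in your listing.

```llvm
; Stand-in for the real type definition (assumption, not from your IR).
%"struct.mdarray_terminal" = type opaque

; Assumed semantics: access_fn only reads memory pointed to by its
; pointer argument and has no other side effects.
declare i32* @access_fn(%"struct.mdarray_terminal"*, i64, i64) #0

attributes #0 = { argmemonly nounwind readonly }
```

If access_fn only computes an address and never touches memory, readnone
would be the stronger choice. Even with readonly, LICM still has to show that
nothing in the loop writes to the memory the call may read, so the store
through %31 may keep the calls in place.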
-- 
Mehdi
On Dec 26, 2016, at 9:12 PM, Diptorup Deb via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Hello,
> 
> I am working on a C++ expression-template-based DSL where we are using
> LLVM for code generation. I need some help understanding the
> behaviour of the LICM pass. In the following example code, the "A" class
> is a custom container that defines various arithmetic operators using
> expression templates. We define three arrays of the "A" container
> and aggregate the result of the multiplication into "lat".
> 
> I was attempting to get the expressions "a[i]" and "b[j]" hoisted
> above the "j-loop" and the "k-loop" respectively.
> 
> //=== C++ code snippet ===//
> 
> A<int> a[4] = {A<int>(&ctx), A<int>(&ctx), A<int>(&ctx), A<int>(&ctx)};
> A<int> b[4] = {A<int>(&ctx), A<int>(&ctx), A<int>(&ctx), A<int>(&ctx)};
> A<int> c[4] = {A<int>(&ctx), A<int>(&ctx), A<int>(&ctx), A<int>(&ctx)};
> A<int> lat(&ctx);
> 
> for (std::size_t i = 0; i < 4; ++i)
>   for (std::size_t j = 0; j < 4; ++j)
>     for (std::size_t k = 0; k < 4; ++k) {
>       lat = a[i] * b[j] * c[k];
>     }
> 
> The IR generated for the body of the innermost loop, after inlining
> most of the expression template calls and loop simplification, is shown
> below.
> 
> If I run LICM on this IR, the GEPs in lines 1 and 2 are hoisted into
> the preheaders of the "j-loop" and the "k-loop" respectively. I believe
> this is because the operands of the GEPs are loop invariant and
> *isSafeToExecuteUnconditionally* trivially returns true for a GEP.
> 
> However, the CallInsts in lines 4 and 6 remain inside the innermost loop,
> because *hasLoopInvariantOperands* returns false for them, as their GEP
> operands are themselves not loop invariant.
> 
> This is the behaviour I am not sure about, and I would greatly appreciate
> some help in understanding it. Also, how should the code be structured
> for LICM to hoist the CallInsts?
> 
> //=== Generated IR for innermost loop body ===//
> 
> 1:  %22 = getelementptr inbounds [4 x %"struct.mdarray_terminal"], [4 x %"struct.mdarray_terminal"]* %a, i64 0, i64 %i.0
> 2:  %23 = getelementptr inbounds [4 x %"struct.mdarray_terminal"], [4 x %"struct.mdarray_terminal"]* %b, i64 0, i64 %j.0
> 3:  %24 = getelementptr inbounds [4 x %"struct.mdarray_terminal"], [4 x %"struct.mdarray_terminal"]* %c, i64 0, i64 %k.0
> 4:  %25 = call i32* @access_fn(%"struct.mdarray_terminal"* %22, i64 0, i64 0)
> 5:  %26 = load i32, i32* %25, !alias.scope !1, !noalias !3
> 6:  %27 = call i32* @access_fn(%"struct.mdarray_terminal"* %23, i64 0, i64 0)
> 7:  %28 = load i32, i32* %27, !alias.scope !5, !noalias !7
> 8:  %mkernel = call i32 @mult_op(i32 %26, i32 %28)
> 9:  %29 = call i32* @access_fn(%"struct.mdarray_terminal"* %24, i64 0, i64 0)
> 10:  %30 = load i32, i32* %29, !alias.scope !6, !noalias !8
> 11:  %mkernel2 = call i32 @mult_op(i32 %mkernel, i32 %30)
> 12:  %31 = call i32* @access_fn(%"struct.mdarray_terminal"* %lat, i64 0, i64 0)
> 13:  store i32 %mkernel2, i32* %31, !alias.scope !4, !noalias !9
> 14:  %32 = add i64 %k.0, 1
> 15:  br label %19
> 
> Best,
> Dipto