Renlin Li via llvm-dev
2018-Aug-14 14:54 UTC
[llvm-dev] Extend TruncInstCombine class in Aggressive Instruction Combine pass to handle users in a different truncation DAG
Hello there,

Multiple users inside the same DAG are already handled by the new pass, in the TruncInstCombine class. I wonder whether it is possible to extend TruncInstCombine to handle cases where the DAG's ending nodes (ZExt/SExt) are also used in a different truncate DAG.

For example, a simple case like this:

define void @test(i8* noalias nocapture readonly %src0_ptr, i8* noalias nocapture readonly %src1_ptr, i16* noalias nocapture %dst_ptr) {
  %1 = load i8, i8* %src1_ptr, align 1
  %2 = load i8, i8* %src0_ptr, align 1
  %3 = zext i8 %2 to i32
  %4 = zext i8 %1 to i32
  %5 = mul nuw nsw i32 %3, %4
  %6 = trunc i32 %5 to i16
  %7 = getelementptr inbounds i8, i8* %src0_ptr, i64 1
  %8 = load i8, i8* %7, align 1
  %9 = zext i8 %8 to i32
  %10 = mul nuw nsw i32 %9, %4
  %11 = trunc i32 %10 to i16
  store i16 %6, i16* %dst_ptr, align 2
  %12 = getelementptr inbounds i16, i16* %dst_ptr, i64 1
  store i16 %11, i16* %12, align 2
  ret void
}

Here %4 is used by both %5 and %10, which belong to two different truncate DAGs (rooted at %6 and %11).

There are more complicated cases where there is a chain of such dependencies.

I have an initial idea for doing this that requires minimal changes to the current TruncInstCombine class.

1, Record each TruncInst which could be shrunk but is blocked by multiple-use values. In the end, a list of such pairs is collected.
2, If the list is not empty, try to merge TruncInst DAGs to reduce the number of values used outside of the DAG.
   2,1 If all users of the value are inside a shrinkable DAG, merge the DAGs to eliminate this dependency. This updates the dependencies of the new DAG; keep doing it until no value has users outside of the DAG. The new, bigger DAG, which contains multiple TruncInsts, can then be shrunk as a whole. Apply the transformation, goto step 3.
   2,3 Otherwise, goto step 3.
3, Remove the related TruncInsts from the list. Goto step 2.

A cost function could probably be added to balance the number of shrinkable truncate instructions against the number of copies that need to be made. By the way, on some architectures an extension may be free if it can be combined with its users.
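To make step 2,1 more concrete, here is a minimal sketch of the merge loop, written as standalone C++ over a toy graph rather than against the real TruncInstCombine code. Every name in it (ValueId, TruncDag, UserMap, findOwner, mergeUntilClosed) is hypothetical and made up for illustration. It also assumes that the per-DAG legality checks have already passed, and it ignores the cost function and any bit-width agreement between the merged DAGs.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using ValueId = int; // hypothetical stand-in for an IR value

// One shrink candidate (hypothetical type, not part of LLVM).
struct TruncDag {
  std::set<ValueId> Roots; // the truncs; their own users never block
  std::set<ValueId> Body;  // values to be re-typed, ZExt/SExt leaves included
  bool Merged = false;     // set once another candidate has absorbed this one
};

// Users[V] lists the values that take V as an operand.
using UserMap = std::map<ValueId, std::set<ValueId>>;

static bool contains(const TruncDag &D, ValueId V) {
  return D.Body.count(V) != 0 || D.Roots.count(V) != 0;
}

// Index of a live candidate other than Self that contains V, or -1.
static int findOwner(const std::vector<TruncDag> &Dags, ValueId V, int Self) {
  for (int I = 0, E = static_cast<int>(Dags.size()); I != E; ++I)
    if (I != Self && !Dags[I].Merged && contains(Dags[I], V))
      return I;
  return -1;
}

// Step 2,1: grow Dags[Idx] by absorbing every candidate that uses one of its
// values. Returns true and commits the merge if no Body value is left with a
// user outside the grown DAG; otherwise returns false and changes nothing.
bool mergeUntilClosed(std::vector<TruncDag> &Dags, int Idx,
                      const UserMap &Users) {
  assert(Idx >= 0 && Idx < static_cast<int>(Dags.size()) && !Dags[Idx].Merged);
  TruncDag Grown = Dags[Idx]; // work on a copy so failure has no side effects
  std::set<int> Absorbed;
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (ValueId V : Grown.Body) {
      auto It = Users.find(V);
      if (It == Users.end())
        continue;
      for (ValueId U : It->second) {
        if (contains(Grown, U))
          continue;
        int Other = findOwner(Dags, U, Idx);
        if (Other < 0)
          return false; // a user that no candidate covers: give up
        Grown.Body.insert(Dags[Other].Body.begin(), Dags[Other].Body.end());
        Grown.Roots.insert(Dags[Other].Roots.begin(), Dags[Other].Roots.end());
        Absorbed.insert(Other);
        Changed = true;
        break;
      }
      if (Changed)
        break; // rescan the enlarged Body from the start
    }
  }
  Dags[Idx] = Grown;
  for (int A : Absorbed)
    Dags[A].Merged = true;
  return true;
}
```

For the example above, numbering the values after their IR names, the two candidates are Roots {6}, Body {3, 4, 5} and Roots {11}, Body {4, 9, 10}. Calling mergeUntilClosed on the first one absorbs the second through the shared %4 and leaves a single candidate with Roots {6, 11}. If %4 had one more user that is not part of any candidate, the call would return false and both candidates would stay as they were.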
I am not entirely comfortable with this approach. Are there any suggestions on how this could be done better?

Thanks,
Renlin