Sebastien DELDON-GNB
2012-Apr-19 12:52 UTC
[LLVMdev] Effect on NSW attribute on 'mul' during InstCombine pass ?
Hi all,

I'm using LLVM 3.0, for which I've filed the following bug: http://llvm.org/bugs/show_bug.cgi?id=12130. I'm trying to solve this problem myself by digging into the LLVM sources. It seems that the problem I'm experiencing is related to the presence or absence of the NSW attribute on a 'mul'. Consider the following code:

define void @t2(double* %x) {
L.entry:
  %a = alloca [2 x i64], align 4
  %0 = bitcast [2 x i64]* %a to i64*
  store i64 3, i64* %0
  %1 = getelementptr [2 x i64]* %a, i32 0, i32 1
  store i64 5, i64* %1
  %2 = bitcast [2 x i64]* %a to double*
  %3 = bitcast double* %2 to i8*
  %4 = load i64* %0
  %5 = sub i64 %4, 2
  %6 = trunc i64 %5 to i32
  %7 = mul i32 %6, 8 ; HERE is problematic line #1
  %8 = getelementptr i8* %3, i32 %7
  %9 = bitcast i8* %8 to double*
  %10 = load double* %9
  %11 = bitcast double* %x to i8*
  %12 = getelementptr i8* %11, i32 8
  %13 = bitcast i8* %12 to double*
  store double %10, double* %13
  ret void
}

If I use opt as follows:

opt -instcombine trb.ll -S -o trb.opt.ll

I get the following code generated:

; ModuleID = 'trb.ll'
define void @t2(double* %x) {
L.entry:
  %a = alloca [2 x i64], align 4
  %0 = getelementptr inbounds [2 x i64]* %a, i32 0, i32 0
  store i64 3, i64* %0
  %1 = getelementptr [2 x i64]* %a, i32 0, i32 1
  store i64 5, i64* %1
  %2 = bitcast [2 x i64]* %a to i8*
  %3 = load i64* %0
  %4 = add i64 %3, 536870910 ; Problematic line #2
  %5 = trunc i64 %4 to i32
  %6 = shl i32 %5, 3
  %7 = getelementptr i8* %2, i32 %6
  %8 = bitcast i8* %7 to double*
  %9 = load double* %8
  %10 = bitcast double* %x to i8*
  %11 = getelementptr i8* %10, i32 8
  %12 = bitcast i8* %11 to double*
  store double %9, double* %12
  ret void
}

If, on problematic line #1, I replace

  %7 = mul i32 %6, 8

by

  %7 = mul nsw i32 %6, 8

then opt generates:

; ModuleID = 'trb.ll'
define void @t2(double* %x) {
L.entry:
  %a = alloca [2 x i64], align 4
  %0 = getelementptr inbounds [2 x i64]* %a, i32 0, i32 0
  store i64 3, i64* %0
  %1 = getelementptr [2 x i64]* %a, i32 0, i32 1
  store i64 5, i64* %1
  %2 = bitcast [2 x i64]* %a to i8*
  %3 = load i64* %0
  %4 = add i64 %3, 4294967294
  %5 = trunc i64 %4 to i32
  %6 = shl nsw i32 %5, 3
  %7 = getelementptr i8* %2, i32 %6
  %8 = bitcast i8* %7 to double*
  %9 = load double* %8
  %10 = bitcast double* %x to i8*
  %11 = getelementptr i8* %10, i32 8
  %12 = bitcast i8* %11 to double*
  store double %9, double* %12
  ret void
}

Digging into the source, I understood that the 'sub' is turned into an 'add' of the two's-complement constant and the 'mul' is turned into a shift. When nsw is not present, the shift is also propagated to the two's-complement constant, clearing its highest 3 bits (4294967294 becomes 536870910). To me this transformation seems invalid. Can someone point me to where it occurs?

The problem with such a transformation is that if I specify a datalayout for the target, then GVN optimizes the code further into:

define void @t2(double* nocapture %x) nounwind {
L.entry:
  %a = alloca [2 x i64], align 8
  %0 = getelementptr inbounds [2 x i64]* %a, i32 0, i32 0
  store i64 3, i64* %0, align 8
  %1 = getelementptr [2 x i64]* %a, i32 0, i32 1
  store i64 5, i64* %1, align 8
  %2 = getelementptr [2 x i64]* %a, i32 0, i32 536870913
  %3 = bitcast i64* %2 to double*
  %4 = getelementptr double* %x, i32 1
  store double undef, double* %4, align 4
  ret void
}

Thus the final store is given an 'undef' value, which is not correct if pointer arithmetic is 32-bit, since 536870913*8 % 2^32 = 8.

Thanks for your help.

Best Regards,
Seb
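
P.S. For anyone who wants to check the constants, here is a minimal sketch of the 32-bit arithmetic involved. It uses Python purely as a calculator and is not taken from the LLVM sources. The helper names `original32` and `offset32` are hypothetical, and the model assumes that 'trunc' and 'shl' on i32 simply wrap modulo 2^32.

```python
MASK32 = (1 << 32) - 1  # assumption: i32 values wrap modulo 2^32


def original32(loaded):
    """Model of the input IR: sub i64 2, trunc to i32, mul i32 by 8."""
    truncated = (loaded - 2) & MASK32
    return (truncated * 8) & MASK32


def offset32(loaded, addend):
    """Model of the instcombine output: add i64, trunc to i32, shl i32 by 3."""
    truncated = (loaded + addend) & MASK32
    return (truncated << 3) & MASK32


loaded = 3  # the value stored to a[0] in the example

print(original32(loaded))            # sub/mul form
print(offset32(loaded, 536870910))   # constant seen without nsw
print(offset32(loaded, 4294967294))  # constant seen with nsw
print((536870913 * 8) % 2**32)       # index that appears after GVN, in bytes
```

Under that wraparound assumption, all four lines print 8, which is the byte offset of a[1]. This matches the claim that the access is in bounds when pointer arithmetic is 32-bit.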