Hi all, The patch below is to fix a problem with unaligned memcpys. This program: void Bork() { int Qux[33] = {0}; } will currently produce this LLVM code on PPC64: @C.0.937 = internal constant [33 x i8] zeroinitializer define void @Bork() { entry: %Qux = alloca [33 x i8] %Qux1 = bitcast [33 x i8]* %Qux to i8* call void @llvm.memcpy.i64( i8* %Qux1, i8* getelementptr ([33 x i8]* @C.0.1173, i32 0, i32 0), i64 33, i32 8 ) br label %return return: ret void } declare void @llvm.memcpy.i64(i8*, i8*, i64, i32) Notice that the alignment of the llvm.memcpy.i64 is 8 but the alignment of Qux isn't -- it's 1 (n.b. the object is placed into the redzone). The problem stems from from llvm-convert.cpp getting the memcpy statement's alignment from the source pointer. The change below will set its alignment to the minimum of the dest and src pointers' alignments. (I do a copy because I didn't want to change it for all cases. If it's okay to change it for all cases, I can take the copy out.) Is this a good thing? Will it break the world? -bw Index: gcc/llvm-convert.cpp ==================================================================--- gcc/llvm-convert.cpp (revision 43366) +++ gcc/llvm-convert.cpp (working copy) @@ -3020,8 +3020,26 @@ Emit(TREE_OPERAND(exp, 1), LV.Ptr); EmitAggregateCopy(DestLoc, LV.Ptr, TREE_TYPE(exp), isVolatile, false, Alignment); } else if (!isVolatile && TREE_CODE(TREE_OPERAND(exp, 0))!=RESULT_DECL) { - Emit(TREE_OPERAND(exp, 1), LV.Ptr); + // At this point, Alignment is the alignment of the destination + // pointer. It may not match the alignment of the source pointer. So, we + // need to make sure that it's has at least its alignment. + tree new_exp = copy_node(TREE_OPERAND(exp, 1)); + unsigned NewAlignment = expr_align(new_exp) / 8; + Alignment = (Alignment < NewAlignment) ? Alignment : NewAlignment; + TYPE_ALIGN(TREE_TYPE(new_exp)) = Alignment; + + switch (TREE_CODE(new_exp)) { + case VAR_DECL: + case PARM_DECL: + case RESULT_DECL: + DECL_ALIGN (new_exp) = Alignment * 8; + break; + default: + break; + } + + Emit(new_exp, LV.Ptr); } else { // Need to do a volatile store into TREE_OPERAND(exp, 1). To do this, we // emit it into a temporary memory location, then do a volatile copy into
On Oct 26, 2007, at 11:12 AM, Bill Wendling wrote:> Hi all, > > The patch below is to fix a problem with unaligned memcpys. This > program: > > void Bork() { > int Qux[33] = {0}; > } > > will currently produce this LLVM code on PPC64: > > @C.0.937 = internal constant [33 x i8] zeroinitializer > > define void @Bork() { > entry: > %Qux = alloca [33 x i8] > %Qux1 = bitcast [33 x i8]* %Qux to i8* > call void @llvm.memcpy.i64( i8* %Qux1, i8* getelementptr ([33 > x i8]* @C.0.1173, i32 0, i32 0), i64 33, i32 8 ) > br label %return > > return: > ret void > } > > declare void @llvm.memcpy.i64(i8*, i8*, i64, i32) > > Notice that the alignment of the llvm.memcpy.i64 is 8 but the > alignment of Qux isn't -- it's 1 (n.b. the object is placed into the > redzone). The problem stems from from llvm-convert.cpp getting the > memcpy statement's alignment from the source pointer. > The change below > will set its alignment to the minimum of the dest and src pointers' > alignments.llvm.memcpy docs says, " the caller guarantees that both the source and destination pointers are aligned to that boundary.". Would it possible to run into a situation where selecting min. of src and dest alignment will not meet this criteria ?> (I do a copy because I didn't want to change it for all > cases. If it's okay to change it for all cases, I can take the copy > out.)In this case, I *think* (not guarantee :) it is ok to change alignment of "{0}" expression node in place, because, I don't think this expr node will be shared with any another "{0}" expression. However, your conservative approach is also fine. - Devang> > Is this a good thing? Will it break the world? > > -bw > > Index: gcc/llvm-convert.cpp > ==================================================================> --- gcc/llvm-convert.cpp (revision 43366) > +++ gcc/llvm-convert.cpp (working copy) > @@ -3020,8 +3020,26 @@ > Emit(TREE_OPERAND(exp, 1), LV.Ptr); > EmitAggregateCopy(DestLoc, LV.Ptr, TREE_TYPE(exp), > isVolatile, false, > Alignment); > } else if (!isVolatile && TREE_CODE(TREE_OPERAND(exp, 0))! > =RESULT_DECL) { > - Emit(TREE_OPERAND(exp, 1), LV.Ptr); > + // At this point, Alignment is the alignment of the destination > + // pointer. It may not match the alignment of the source > pointer. So, we > + // need to make sure that it's has at least its alignment. > + tree new_exp = copy_node(TREE_OPERAND(exp, 1)); > + unsigned NewAlignment = expr_align(new_exp) / 8; > + Alignment = (Alignment < NewAlignment) ? Alignment : > NewAlignment; > + TYPE_ALIGN(TREE_TYPE(new_exp)) = Alignment; > + > + switch (TREE_CODE(new_exp)) { > + case VAR_DECL: > + case PARM_DECL: > + case RESULT_DECL: > + DECL_ALIGN (new_exp) = Alignment * 8; > + break; > + default: > + break; > + } > + > + Emit(new_exp, LV.Ptr); > } else { > // Need to do a volatile store into TREE_OPERAND(exp, 1). > To do this, we > // emit it into a temporary memory location, then do a > volatile copy into > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> llvm.memcpy docs says, " the caller guarantees that both the source > and destination pointers are aligned to that boundary.". Would it > possible to run into a situation where selecting min. of src and dest > alignment will not meet this criteria ?Since alignments are always powers of 2, the minimum of two alignments divides both of them. So no, this is not possible. However you do indeed have to be careful - just yesterday I noticed a mistake in LLVM in which it takes the minimum of an alignment and an object size to calculate a new alignment, which might be wrong if the object size is not a power of 2. Think of an 8 byte aligned pointer to an x86 long double (which has size 12 on some machines), and update the pointer so it points to the following long double. What is the new alignment? It is not 8, the minimum of 8 and 12, it is 4. I will send in a patch if I remember :) Ciao, Duncan.