Hello!

I have the following simple IR:

=================================
@l = common global i64 0, align 8

define void @hello() nounwind {
entry:
  store i64 -4919131755279862989, i64* @l
  ret void
}

define i32 @main(i32 %argc, i8** %argv) nounwind {
entry:
  call void @hello()
  %tmp = load i64* @l
  %conv = trunc i64 %tmp to i32
  ret i32 %conv
}
=================================

Of interest are the lines

  %tmp = load i64* @l
  %conv = trunc i64 %tmp to i32

... which LLVM automatically translates to "load singleword" from memory, instead of "load doubleword; then truncate". However, this (simply load a singleword) gives an erroneous result. On my architecture, this results in the high bits being loaded into the return register, instead of the low bits, as should happen with truncate.

Details: i64's are stored in two adjacent 32 bit registers. So the store happens like this:

  "stddw 0xBBBBBBBB33333333, *$ptr"

and the load should happen like this:

  "lddw *$ptr, A5:A4"

It is easy to see that if

  "ldw *$ptr, A4"

is printed, then the high bits will be loaded into A4. Something like this would be correct just like the lddw variant:

  "ldw *-$ptr(4), A4"

Is there a way to change this behavior (that (trunc (load doubleword)) is replaced by (load word))?
Or if you think the fault lies on my side, where is the right point to begin debugging?

Thanks for any kind of hint!
Johannes
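The layout question raised above can be checked independently of any backend. The following is a minimal sketch in plain C, not LLVM or TI code; the helper names store_u64, load_u32 and word_at are hypothetical. It writes the constant from the IR into a byte buffer in an explicitly chosen byte order and reads back the 32-bit word found at a given byte offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store a 64-bit value into buf one byte at a time,
 * using the requested byte order instead of the host's. */
void store_u64(uint8_t *buf, uint64_t value, int big_endian) {
    assert(buf != 0);
    for (int i = 0; i < 8; ++i) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        buf[i] = (uint8_t)(value >> shift);
    }
}

/* Hypothetical helper: load a 32-bit word from buf in the same byte order. */
uint32_t load_u32(const uint8_t *buf, int big_endian) {
    assert(buf != 0);
    uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        result |= (uint32_t)buf[i] << shift;
    }
    return result;
}

/* Store `value` as a doubleword, then return the 32-bit word that a
 * single-word load at byte offset `offset` (0 or 4) would see. */
uint32_t word_at(uint64_t value, int offset, int big_endian) {
    uint8_t buf[8];
    assert(offset == 0 || offset == 4);
    store_u64(buf, value, big_endian);
    return load_u32(buf + offset, big_endian);
}
```

With the value 0xBBBBBBBB33333333, a little-endian layout puts the low word 0x33333333 at offset 0 and the high word at offset 4; a big-endian layout is the other way round. A single-word load at the doubleword's own address that returns the high word therefore points to memory holding the value in big-endian word order.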
Hi Johannes,

what processor are you targeting? Is it little-endian or big-endian?

Ciao, Duncan.
> Hi Johannes, what processor are you targeting? Is it little-endian or
> big-endian?

Little-endian. (The truth: you can set it manually, but it is set to little endian, for sure.) The processor is a TI TMS320C64x.

Follow-up: I discovered that the "guilty" method is DAGCombiner::ReduceLoadWidth. The error is introduced because the offset is not calculated correctly.

The first problem is that the pointer I get for loading does not point to the address of the low word, but to the address of the high word.

The second problem is that this is apparently correct as long as lddw is used instead of ldw.

Do you have any ideas on this?

(The third problem is that the creation of the pointer is not my doing. I'm just extending our backend to support i64 additionally (instead of just i32 and smaller). Doing this turns out to be trickier than expected.)

Cheers,
Johannes
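To make the offset question concrete, here is a minimal sketch in plain C of the byte offset a narrowed load needs so that it matches "load doubleword; then truncate". It is not the code of DAGCombiner::ReduceLoadWidth. The names narrowed_load_offset, put_u64, get_u32 and narrowed_load_matches_trunc are hypothetical, and the byte order is passed in explicitly rather than taken from any target description.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset, relative to the address of the wide value, at which the
 * low `narrow_bytes` bytes live. Little-endian keeps them at offset 0;
 * big-endian keeps them at the end of the wide value. */
unsigned narrowed_load_offset(unsigned wide_bytes, unsigned narrow_bytes,
                              int big_endian) {
    assert(narrow_bytes <= wide_bytes);
    return big_endian ? wide_bytes - narrow_bytes : 0;
}

/* Hypothetical memory model: write a 64-bit value in the given byte order. */
void put_u64(uint8_t *mem, uint64_t value, int big_endian) {
    for (int i = 0; i < 8; ++i) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        mem[i] = (uint8_t)(value >> shift);
    }
}

/* Hypothetical memory model: read a 32-bit word in the given byte order. */
uint32_t get_u32(const uint8_t *mem, int big_endian) {
    uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        result |= (uint32_t)mem[i] << shift;
    }
    return result;
}

/* Returns 1 if a 32-bit load at the computed offset yields the same
 * value as truncating the full 64-bit value, 0 otherwise. */
int narrowed_load_matches_trunc(uint64_t value, int big_endian) {
    uint8_t mem[8];
    put_u64(mem, value, big_endian);
    unsigned offset = narrowed_load_offset(8, 4, big_endian);
    return get_u32(mem + offset, big_endian) == (uint32_t)value;
}
```

Under this model, a little-endian target needs offset 0 for the narrowed load and a big-endian target needs offset 4. If a backend's doubleword store places the register pair in memory in the opposite word order from what the declared endianness implies, a narrowed load at the computed offset would pick up the wrong half even though a matching doubleword load still round-trips correctly.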