Kalle.Raiskila at nokia.com
2010-May-31 13:47 UTC
[LLVMdev] Error with instruction selection
Hello,
I'm seeing a miscompilation from the following code:

  declare <4 x float>* @getPtr()

  define <4 x float> @func() {
    %rv1 = call <4 x float>* @getPtr()
    %rv2 = call <4 x float>* @getPtr()
    %rv3 = load <4 x float>* %rv1
    ret <4 x float> %rv3
  }

The load ends up loading from the pointer returned by the 2nd function
call.
I traced the problem down to the call to
SelectionDAGISel::SelectCodeCommon on the load instruction. Before
that call the DAG looks OK. The selected target load ends up pointing
to the "physical function call return register node" instead of the
CopyFromReg node that copies the result of the 1st call to a temporary
register. The physical return register is then overwritten by the next
call. (This is visible when calling "llc -view-isel-dags
-view-sched-dags". The first graph is OK, the second is not.)
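
To make the clobbering concrete, here is a toy model in Python. It is only a sketch of the effect described above, not LLVM code: the Machine class, its fields, and the function names are all hypothetical. It assumes every call writes its result to one shared "physical return register", and compares reading through a copy taken right after the 1st call with reading the physical register after the 2nd call.

```python
# Toy model of return-register clobbering. Not LLVM code; every name here
# is hypothetical and exists only for illustration.

class Machine:
    def __init__(self):
        self.ret_reg = None       # stands in for the physical return register
        self.vregs = {}           # temporary (virtual) registers
        self.memory = {}          # address -> stored value
        self._next_addr = 0x1000

    def call_get_ptr(self):
        # Each call returns a fresh pointer and overwrites the same
        # physical return register.
        addr = self._next_addr
        self._next_addr += 0x10
        self.memory[addr] = f"data@{addr:#x}"
        self.ret_reg = addr

    def copy_from_reg(self, name):
        # Save the current return value into a temporary register.
        self.vregs[name] = self.ret_reg


def correct_selection():
    # The load uses the copy made right after the 1st call.
    m = Machine()
    m.call_get_ptr()
    m.copy_from_reg("rv1")
    m.call_get_ptr()
    m.copy_from_reg("rv2")
    return m.memory[m.vregs["rv1"]]


def buggy_selection():
    # The load reads the physical register directly, but only after the
    # 2nd call has already overwritten it.
    m = Machine()
    m.call_get_ptr()
    m.call_get_ptr()
    return m.memory[m.ret_reg]


if __name__ == "__main__":
    print("correct:", correct_selection())
    print("buggy:  ", buggy_selection())
```

In this model the correct variant reads the data behind the 1st pointer, while the buggy variant reads the data behind the 2nd, which matches the symptom in the generated code.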
The problem goes away if I:
- have getPtr return anything other than <4xfloat>* or <4xi32>*
  (e.g. <4xfloat> or float* work just fine)
- do not load from or store to the pointer, e.g. just returning the
  pointer works
- target any processor other than CellSPU (ok, some backends assert on
  this code, and the PIC assembly I didn't understand :) )
Any explanation on what is going on or hints on how to fix this are
highly appreciated!
thanks,
kalle
Kalle.Raiskila at nokia.com
2010-Jun-01 13:41 UTC
[LLVMdev] Error with instruction selection
On Mon, 2010-05-31 at 15:47 +0200, Raiskila Kalle wrote:
> Hello,
> I'm seeing a miscompilation from the following code:
>
> declare <4 x float>* @getPtr()
> define <4 x float> @func() {
> %rv1 = call <4 x float>* @getPtr()
> %rv2 = call <4 x float>* @getPtr()
> %rv3 = load <4 x float>* %rv1
> ret <4 x float> %rv3
> }

Never mind, fixed in 105269. Turns out that the selecting of the load
did some premature optimization.

kalle