I've run into a situation in isel where it seems like the selector is
generating a cycle in the DAG.
I have something like this:
0x215f140: v2f64 = llvm.x86.sse2.min.sd 0x215efd0, 0x21606d0, 0x215eb80
[0] 0x215efd0: i64 = Constant <647>
[0] 0x21606d0: v2f64 = scalar_to_vector 0x213b8f0
[0] 0x213b8f0: f64,ch = load 0x213b780, 0x213aa90, 0x213b610
<0x2113690:0>
alignment=8
[0] 0x213b780: ch = Prefetch 0x213aee0:1, 0x213b1c0, 0x213b330, 0x213b4a0
[1] 0x213aee0: f64,ch = load 0x213a720, 0x213ac00, 0x213b610
<0x215ace8:0> alignment=8
[0] 0x213a720: ch = EntryToken
[0] 0x213ac00: i64,ch = CopyFromReg 0x213a720, 0x213ad70
[0] 0x213a720: ch = EntryToken
[0] 0x213ad70: i64 = Register #1024
[0] 0x213b610: i64 = undef
[0] 0x213b1c0: i64 = add 0x213ac00, 0x213b050
[0] 0x213ac00: i64,ch = CopyFromReg 0x213a720, 0x213ad70
[0] 0x213a720: ch = EntryToken
[0] 0x213ad70: i64 = Register #1024
[0] 0x213b050: i64 = Constant <136>
[0] 0x213b330: i32 = Constant <0>
[0] 0x213b4a0: i32 = Constant <3>
[0] 0x213aa90: i64 = Constant <0>
[0] 0x213b610: i64 = undef
[0] 0x215eb80: v2f64 = scalar_to_vector 0x213aee0
[0] 0x213aee0: f64,ch = load 0x213a720, 0x213ac00, 0x213b610
<0x215ace8:0>
alignment=8
[0] 0x213a720: ch = EntryToken
[0] 0x213ac00: i64,ch = CopyFromReg 0x213a720, 0x213ad70
[0] 0x213a720: ch = EntryToken
[0] 0x213ad70: i64 = Register #1024
[0] 0x213b610: i64 = undef
I added the brackets on output to show which result is being linked to, so the
prefetch takes the chain output from its controlling load.
When llvm.x86.sse2.min.sd gets selected, the TableGen-generated code
does this:
SDNode *ResNode = CurDAG->SelectNodeTo(N.getNode(), Opc0, VT0, MVT::Other,
Ops0, 6);
0x215f140: v2f64,ch = <<Unknown Machine Node>> 0x21606d0, 0x213ac00,
0x215ffa0, 0x215ea10, 0x215ee60, 0x213a720
[0] 0x21606d0: v2f64 = scalar_to_vector 0x213b8f0
[0] 0x213b8f0: f64,ch = load 0x213b780, 0x213aa90, 0x213b610
<0x2113690:0>
alignment=8
[0] 0x213b780: ch = Prefetch 0x213aee0:1, 0x213b1c0, 0x213b330, 0x213b4a0
[1] 0x213aee0: f64,ch = load 0x213a720, 0x213ac00, 0x213b610
<0x215ace8:0> alignment=8
[0] 0x213a720: ch = EntryToken
[0] 0x213ac00: i64,ch = CopyFromReg 0x213a720, 0x213ad70
[0] 0x213a720: ch = EntryToken
[0] 0x213ad70: i64 = Register #1024
[0] 0x213b610: i64 = undef
[0] 0x213b1c0: i64 = add 0x213ac00, 0x213b050
[0] 0x213ac00: i64,ch = CopyFromReg 0x213a720, 0x213ad70
[0] 0x213a720: ch = EntryToken
[0] 0x213ad70: i64 = Register #1024
[0] 0x213b050: i64 = Constant <136>
[0] 0x213b330: i32 = Constant <0>
[0] 0x213b4a0: i32 = Constant <3>
[0] 0x213aa90: i64 = Constant <0>
[0] 0x213b610: i64 = undef
[0] 0x213ac00: i64,ch = CopyFromReg 0x213a720, 0x213ad70
[0] 0x213a720: ch = EntryToken
[0] 0x213ad70: i64 = Register #1024
[0] 0x215ffa0: i8 = TargetConstant <1>
[0] 0x215ea10: i64 = Register #0
[0] 0x215ee60: i32 = TargetConstant <0>
[0] 0x213a720: ch = EntryToken
Ok, so far so good.
Then it does this:
ReplaceUses(SDValue(CPInChain2.getNode(), 1), SDValue(ResNode, 1));
CPInChain2 is this:
0x213aee0: f64,ch = load 0x213a720, 0x213ac00, 0x213b610 <0x215ace8:0>
alignment=8
[0] 0x213a720: ch = EntryToken
[0] 0x213ac00: i64,ch = CopyFromReg 0x213a720, 0x213ad70
[0] 0x213a720: ch = EntryToken
[0] 0x213ad70: i64 = Register #1024
[0] 0x213b610: i64 = undef
So it's replacing the load/memory chain output of the min intrinsic with the
chain output of the min intrinsic. Ok, that seems right. But one of the
users of that chain output is the prefetch. So now the prefetch will get a
chain input pointing to the chain output of the min intrinsic, creating a
cycle.
The fundamental issue is that the DAG originally looked like this:
MIN
LOAD B
PREFETCH
Chain from LOAD A
LOAD A
Now the chain will come from the MIN and we'll have:
MIN
LOAD B
PREFETCH
Chain from MIN
X86AddressOperands
Is such a thing legal? I wouldn't expect so since it's cycle.
Eventually we
blow up in topological sort since it can't find one.
What's the right way to handle this?
-Dave