On Friday 26 February 2010 10:34:41 David Greene wrote:> On Friday 26 February 2010 09:55:32 David Greene wrote: > > In the continuing quest to try to track down problems we're seeing in > > SelectionDAG, I added the following assert > > toSelectionDAG::ReplaceAllUsesOfValuesWith: > > Here's a patch to add more of these deleted node asserts. They fire > tons of times in the testbase.Ping? Just want to make sure this didn't get missed somehow. I'm surprised to see no discussion. -Dave
On Feb 26, 2010, at 2:07 PM, David Greene wrote:> On Friday 26 February 2010 10:34:41 David Greene wrote: >> On Friday 26 February 2010 09:55:32 David Greene wrote: >>> In the continuing quest to try to track down problems we're seeing >>> in >>> SelectionDAG, I added the following assert >>> toSelectionDAG::ReplaceAllUsesOfValuesWith: >> >> Here's a patch to add more of these deleted node asserts. They fire >> tons of times in the testbase. > > Ping? Just want to make sure this didn't get missed somehow. I'm > surprised to see no discussion.I've now looked at your latest patch. In summary, it does expose a subtle problem. I haven't seen anything that here would lead to observable misbehavior yet though. X86GenDAGISel.inc has code like this: SDValue N1 = N->getOperand(1); ... SDNode *ResNode = CurDAG->SelectNodeTo(N, ...); ... ReplaceUses(SDValue(N1.getNode(), 1), SDValue(ResNode, 2)); If N was the only user of N1, and N1 isn't in the new operand list, then the SelectNodeTo call will make N1 dead. SelectNodeTo will automatically delete nodes that are made dead by the transformation. This means that the "from" node in the subsequent ReplaceUses call will be a deleted node in that case. This doesn't break in practice because deleted node aren't actually deallocated, they're just put on a linked list of nodes ready to be reused next time a new node is created, and no new nodes are created between the SelectNodeTo call and the ReplaceUses call. Another observation here is that the ReplaceUses call in such cases doesn't do anything, because the From node has no uses. Perhaps this can be fixed by making the code skip the ReplaceUses call in the case where there are no uses to replace. That's not trivial to detect though. Dan
On Friday 26 February 2010 19:09:01 Dan Gohman wrote:> I've now looked at your latest patch. In summary, it does expose a > subtle problem. I haven't seen anything that here would lead to > observable misbehavior yet though.Well, I'm definitely observing misbehavior. I know it has something to do with local changes here but I haven't isolated it yet.> X86GenDAGISel.inc has code like this: > > SDValue N1 = N->getOperand(1); > ... > SDNode *ResNode = CurDAG->SelectNodeTo(N, ...); > ... > ReplaceUses(SDValue(N1.getNode(), 1), SDValue(ResNode, 2)); > > If N was the only user of N1, and N1 isn't in the new operand list, then > the SelectNodeTo call will make N1 dead. SelectNodeTo will automatically > delete nodes that are made dead by the transformation. This means that > the "from" node in the subsequent ReplaceUses call will be a deleted > node > in that case.Ok, so that should be easy to fix.> This doesn't break in practice because deleted node aren't actually > deallocated, they're just put on a linked list of nodes ready to be > reused next time a new node is created, and no new nodes are created > between the SelectNodeTo call and the ReplaceUses call.Right.> Another observation here is that the ReplaceUses call in such cases > doesn't do anything, because the From node has no uses.Right.> Perhaps this can be fixed by making the code skip the ReplaceUses > call in the case where there are no uses to replace. That's not trivial > to detect though.Why not just check the same thing the added asserts check? What I'm seeing is a problem in ReplaceAllUsesOf itself. It recurses down and eventually replaces the node under the iterator in this use loop: SDNode::use_iterator UI = From.getNode()->use_begin(), UE = From.getNode()->use_end(); while (UI != UE) { SDNode *User = *UI; bool UserRemovedFromCSEMaps = false; UI goes bad and we blow up after returning from a deeply recursed call. It's simply not safe to iterate over a set that may change. Unfortunately, any of the nodes under the iterators may change so I don't see an easy way to fix this. -Dave