Sean Fertile via llvm-dev
2017-Oct-13 23:05 UTC
[llvm-dev] [SelectionDAG] Assertion due to MachineMemOperand flags difference.
Hello, I've hit an assertion in SelectionDAG where we try to merge 2 loads that have the same operands but their MMO flags differ. One is dereferenceable and one is not. I'm not sure what the underlying issue here is: 1) MDSDNode with the same operands should have the same flags set on their respective MMO. The fact the flags differ when the opcode,types,operands and address-space are the same is the problem. 2) Having MDSDnodes with the same operands but different MMO flags is possible, so the Flags should be added to the FoldingSetNodeID. 3) Something else I haven't considered. I have a patch posted implementing 2, but don't know if I should look at fixing 1 as well (or perhaps instead). The loads that trigger the assertion are: t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46, undef:i64 t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64 I would expect the the second load should also be marked dereferenceable since its loading from one of the TargetFrames. Am I on the right track here? Thanks Sean -------------- next part -------------- Initial selection DAG: BB#0 '_Z3fn2v:entry' SelectionDAG has 122 nodes: t4: i64 = GlobalAddress<void (%class.F*)* @_Z10EmitLValuev> 0 t10: i64 = add Register:i64 %X1, Constant:i64<32> t0: ch = EntryToken t3: ch = lifetime.start t0, TargetFrameIndex:i64<1> t7: ch,glue = callseq_start t3, TargetConstant:i64<32>, TargetConstant:i64<0> t12: ch,glue = CopyToReg t7, Register:i64 %X3, FrameIndex:i64<1> t16: ch,glue = PPCISD::CALL_NOP t12, TargetGlobalAddress:i64<void (%class.F*)* @_Z10EmitLValuev> 0, Register:i64 %X3, Register:i64 %X2, RegisterMask:Untyped, t12:1 t17: ch,glue = callseq_end t16, TargetConstant:i64<32>, TargetConstant:i64<0>, t16:1 t20: ch = lifetime.start t17, TargetFrameIndex:i64<0> t21: i64 = Constant<96> t22: i64 = Constant<0> t24: v4i32,ch = load<LD16[%0](align=8)(dereferenceable)> t20, FrameIndex:i64<1>, undef:i64 t27: i64 = add FrameIndex:i64<1>, Constant:i64<16> t28: v4i32,ch = load<LD16[%0+16](align=8)(dereferenceable)> t20, t27, undef:i64 t31: i64 = add FrameIndex:i64<1>, Constant:i64<32> t32: v4i32,ch = load<LD16[%0+32](align=8)(dereferenceable)> t20, t31, undef:i64 t36: i64 = add FrameIndex:i64<1>, Constant:i64<48> t37: v4i32,ch = load<LD16[%0+48](align=8)(dereferenceable)> t20, t36, undef:i64 t41: i64 = add FrameIndex:i64<1>, Constant:i64<64> t42: v4i32,ch = load<LD16[%0+64](align=8)(dereferenceable)> t20, t41, undef:i64 t46: i64 = add FrameIndex:i64<1>, Constant:i64<80> t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46, undef:i64 t25: ch = store<ST16[%1]> t20, t24, FrameIndex:i64<0>, undef:i64 t29: i64 = add FrameIndex:i64<0>, Constant:i64<16> t30: ch = store<ST16[%1+16]> t20, t28, t29, undef:i64 t33: i64 = add FrameIndex:i64<0>, Constant:i64<32> t34: ch = store<ST16[%1+32]> t20, t32, t33, undef:i64 t38: i64 = add FrameIndex:i64<0>, Constant:i64<48> t39: ch = store<ST16[%1+48]> t20, t37, t38, undef:i64 t43: i64 = add FrameIndex:i64<0>, Constant:i64<64> t44: ch = store<ST16[%1+64]> t20, t42, t43, undef:i64 t48: i64 = add FrameIndex:i64<0>, Constant:i64<80> t49: ch = store<ST16[%1+80]> t20, t47, t48, undef:i64 t50: ch = TokenFactor t24:1, t25, t28:1, t30, t32:1, t34, t37:1, t39, t42:1, t44, t47:1, t49 t51: i64 = GlobalAddress<void (%class.F*)* @_Z3fn11F> 0 t53: ch,glue = callseq_start t50, TargetConstant:i64<128>, TargetConstant:i64<0> t54: i32 = Constant<96> t55: v4i32,ch = load<LD16[FixedStack1](align=8)> t50, FrameIndex:i64<1>, undef:i64 t57: v4i32,ch = load<LD16[FixedStack1+16](align=8)> t50, t27, undef:i64 t60: v4i32,ch = load<LD16[FixedStack1+32](align=8)> t50, t31, undef:i64 t63: v4i32,ch = load<LD16[FixedStack1+48](align=8)> t50, t36, undef:i64 t66: v4i32,ch = load<LD16[FixedStack1+64](align=8)> t50, t41, undef:i64 t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64 t56: ch = store<ST16[<unknown>](align=8)> t50, t55, t10, undef:i64 t58: i64 = add t10, Constant:i64<16> t59: ch = store<ST16[<unknown>](align=8)> t50, t57, t58, undef:i64 t61: i64 = add t10, Constant:i64<32> t62: ch = store<ST16[<unknown>](align=8)> t50, t60, t61, undef:i64 t64: i64 = add t10, Constant:i64<48> t65: ch = store<ST16[<unknown>](align=8)> t50, t63, t64, undef:i64 t67: i64 = add t10, Constant:i64<64> t68: ch = store<ST16[<unknown>](align=8)> t50, t66, t67, undef:i64 t70: i64 = add t10, Constant:i64<80> t71: ch = store<ST16[<unknown>](align=8)> t50, t69, t70, undef:i64 t72: ch = TokenFactor t55:1, t56, t57:1, t59, t60:1, t62, t63:1, t65, t66:1, t68, t69:1, t71 t73: ch,glue = callseq_start t72, TargetConstant:i64<128>, TargetConstant:i64<0> t74: i64,ch = load<LD8[FixedStack1]> t73, FrameIndex:i64<1>, undef:i64 t76: i64 = add FrameIndex:i64<1>, Constant:i64<8> t77: i64,ch = load<LD8[FixedStack1+8]> t73, t76, undef:i64 t78: i64,ch = load<LD8[FixedStack1+16]> t73, t27, undef:i64 t80: i64 = add FrameIndex:i64<1>, Constant:i64<24> t81: i64,ch = load<LD8[FixedStack1+24]> t73, t80, undef:i64 t82: i64,ch = load<LD8[FixedStack1+32]> t73, t31, undef:i64 t84: i64 = add FrameIndex:i64<1>, Constant:i64<40> t85: i64,ch = load<LD8[FixedStack1+40]> t73, t84, undef:i64 t86: i64,ch = load<LD8[FixedStack1+48]> t73, t36, undef:i64 t88: i64 = add FrameIndex:i64<1>, Constant:i64<56> t89: i64,ch = load<LD8[FixedStack1+56]> t73, t88, undef:i64 t90: ch = TokenFactor t74:1, t77:1, t78:1, t81:1, t82:1, t85:1, t86:1, t89:1 t91: ch,glue = CopyToReg t90, Register:i64 %X3, t74 t93: ch,glue = CopyToReg t91, Register:i64 %X4, t77, t91:1 t95: ch,glue = CopyToReg t93, Register:i64 %X5, t78, t93:1 t97: ch,glue = CopyToReg t95, Register:i64 %X6, t81, t95:1 t99: ch,glue = CopyToReg t97, Register:i64 %X7, t82, t97:1 t101: ch,glue = CopyToReg t99, Register:i64 %X8, t85, t99:1 t103: ch,glue = CopyToReg t101, Register:i64 %X9, t86, t101:1 t105: ch,glue = CopyToReg t103, Register:i64 %X10, t89, t103:1 t107: ch,glue = PPCISD::CALL_NOP t105, TargetGlobalAddress:i64<void (%class.F*)* @_Z3fn11F> 0, Register:i64 %X3, Register:i64 %X4, Register:i64 %X5, Register:i64 %X6, Register:i64 %X7, Register:i64 %X8, Register:i64 %X9, Register:i64 %X10, Register:i64 %X2, RegisterMask:Untyped, t105:1 t109: i64 = GlobalAddress<i32 (%class.F*)* @_ZN1F11isGlobalRegEv> 0 t108: ch,glue = callseq_end t107, TargetConstant:i64<128>, TargetConstant:i64<0>, t107:1 t110: ch,glue = callseq_start t108, TargetConstant:i64<32>, TargetConstant:i64<0> t111: ch,glue = CopyToReg t110, Register:i64 %X3, FrameIndex:i64<0> t113: ch,glue = PPCISD::CALL_NOP t111, TargetGlobalAddress:i64<i32 (%class.F*)* @_ZN1F11isGlobalRegEv> 0, Register:i64 %X3, Register:i64 %X2, RegisterMask:Untyped, t111:1 t114: ch,glue = callseq_end t113, TargetConstant:i64<32>, TargetConstant:i64<0>, t113:1 t115: i64,ch,glue = CopyFromReg t114, Register:i64 %X3, t114:1 t117: i64 = AssertSext t115, ValueType:ch:i32 t118: i32 = truncate t117 t119: ch = lifetime.end t115:1, TargetFrameIndex:i64<0> t120: ch = lifetime.end t119, TargetFrameIndex:i64<1> t121: ch = PPCISD::RET_FLAG t120 -------------- next part -------------- A non-text attachment was scrubbed... Name: MMO_Flag_Mismatch.ll Type: application/octet-stream Size: 1726 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171013/a804467a/attachment.obj>
Hal Finkel via llvm-dev
2017-Oct-26 00:28 UTC
[llvm-dev] [SelectionDAG] Assertion due to MachineMemOperand flags difference.
On 10/13/2017 06:05 PM, Sean Fertile via llvm-dev wrote:> Hello, > > I've hit an assertion in SelectionDAG where we try to merge 2 loads > that have the same operands but their MMO flags differ. One is > dereferenceable and one is not. I'm not sure what the underlying issue > here is: > > 1) MDSDNode with the same operands should have the same flags set on > their respective MMO. The fact the flags differ when the > opcode,types,operands and address-space are the same is the problem. > > 2) Having MDSDnodes with the same operands but different MMO flags is > possible, so the Flags should be added to the FoldingSetNodeID. > > 3) Something else I haven't considered. > > I have a patch posted implementing 2, but don't know if I should look > at fixing 1 as well (or perhaps instead). The loads that trigger the > assertion are: > > t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46, undef:i64 > t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64 > > I would expect the the second load should also be marked > dereferenceable since its loading from one of the TargetFrames. Am I > on the right track here?Hi, Sean, To follow-up on this, first I'll note that, IIRC, we've fixed a couple of related problems in the recent past. Specifically, the following come to mind: r309930 and r306404. In this case, it seems like something is dropping this information somewhere. Or, perhaps, I don't see why we have a "FixedStack" location that's not marked as dereferenceable. That having been said... As for whether "having MDSDnodes with the same operands but different MMO flags is possible", the answer is yes. Of the bits that are currently defined, MOLoad and MOStore are attached to fundamental properties of the corresponding SDNode. Of the remainder: MOVolatile, MONonTemporal, MODereferenceable, and MOInvariant, some of these can be set independently on a per-operation basis. For example, one can certainly have a two volatile loads of the same address, one which is non-temporal and one which is not. However, those bits are already added to the FoldingSetNodeID, because they're mirrored by the MemSDNode's MemSDNodeBits (which has bits for IsVolatile, IsNonTemporal, IsDereferenceable, and IsInvariant), and these should be returned by getRawSubclassData, which is called in all of the places that D38898 modifies). Thus, in short, I don't understand why the IsDereferenceable bit is not already included in the FoldingSetNodeID. -Hal> > Thanks > Sean > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171025/a7b43730/attachment-0001.html>
Sean Fertile via llvm-dev
2017-Oct-26 21:19 UTC
[llvm-dev] [SelectionDAG] Assertion due to MachineMemOperand flags difference.
Hi Hal, Thanks for the reply, that clears up a fair bit for me. I think the reason we have the 'FixedStack' location without it being marked as dereferenceable is because its created in CreateCopyOfByValArgument where we use a default MachinePointerInfo in the call to getMemcpy. As for fixing the assertion I think I understand now why we look up a node with different MMO flags, its a 'MemIntrinsicSDNode' which doesn't add its subclass data to the FoldingSetNode. I'll update the patch on D38898 accordingly. Thanks Sean On Wed, Oct 25, 2017 at 8:28 PM, Hal Finkel <hfinkel at anl.gov> wrote:> On 10/13/2017 06:05 PM, Sean Fertile via llvm-dev wrote: > > Hello, > > I've hit an assertion in SelectionDAG where we try to merge 2 loads > that have the same operands but their MMO flags differ. One is > dereferenceable and one is not. I'm not sure what the underlying issue > here is: > > 1) MDSDNode with the same operands should have the same flags set on > their respective MMO. The fact the flags differ when the > opcode,types,operands and address-space are the same is the problem. > > 2) Having MDSDnodes with the same operands but different MMO flags is > possible, so the Flags should be added to the FoldingSetNodeID. > > 3) Something else I haven't considered. > > I have a patch posted implementing 2, but don't know if I should look > at fixing 1 as well (or perhaps instead). The loads that trigger the > assertion are: > > t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46, > undef:i64 > t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64 > > I would expect the the second load should also be marked > dereferenceable since its loading from one of the TargetFrames. Am I > on the right track here? > > > Hi, Sean, > > To follow-up on this, first I'll note that, IIRC, we've fixed a couple of > related problems in the recent past. Specifically, the following come to > mind: r309930 and r306404. In this case, it seems like something is dropping > this information somewhere. Or, perhaps, I don't see why we have a > "FixedStack" location that's not marked as dereferenceable. That having been > said... > > As for whether "having MDSDnodes with the same operands but different MMO > flags is possible", the answer is yes. Of the bits that are currently > defined, MOLoad and MOStore are attached to fundamental properties of the > corresponding SDNode. Of the remainder: MOVolatile, MONonTemporal, > MODereferenceable, and MOInvariant, some of these can be set independently > on a per-operation basis. For example, one can certainly have a two volatile > loads of the same address, one which is non-temporal and one which is not. > However, those bits are already added to the FoldingSetNodeID, because > they're mirrored by the MemSDNode's MemSDNodeBits (which has bits for > IsVolatile, IsNonTemporal, IsDereferenceable, and IsInvariant), and these > should be returned by getRawSubclassData, which is called in all of the > places that D38898 modifies). Thus, in short, I don't understand why the > IsDereferenceable bit is not already included in the FoldingSetNodeID. > > -Hal > > > Thanks > Sean > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > -- > Hal Finkel > Lead, Compiler Technology and Programming Languages > Leadership Computing Facility > Argonne National Laboratory