Sean Fertile via llvm-dev
2017-Oct-13 23:05 UTC
[llvm-dev] [SelectionDAG] Assertion due to MachineMemOperand flags difference.
Hello,
I've hit an assertion in SelectionDAG where we try to merge 2 loads
that have the same operands but their MMO flags differ. One is
dereferenceable and one is not. I'm not sure what the underlying issue
here is:
1) MDSDNode with the same operands should have the same flags set on
their respective MMO. The fact the flags differ when the
opcode,types,operands and address-space are the same is the problem.
2) Having MDSDnodes with the same operands but different MMO flags is
possible, so the Flags should be added to the FoldingSetNodeID.
3) Something else I haven't considered.
I have a patch posted implementing 2, but don't know if I should look
at fixing 1 as well (or perhaps instead). The loads that trigger the
assertion are:
t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46,
undef:i64
t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64
I would expect the the second load should also be marked
dereferenceable since its loading from one of the TargetFrames. Am I
on the right track here?
Thanks
Sean
-------------- next part --------------
Initial selection DAG: BB#0 '_Z3fn2v:entry'
SelectionDAG has 122 nodes:
t4: i64 = GlobalAddress<void (%class.F*)* @_Z10EmitLValuev> 0
t10: i64 = add Register:i64 %X1, Constant:i64<32>
t0: ch = EntryToken
t3: ch = lifetime.start t0, TargetFrameIndex:i64<1>
t7: ch,glue = callseq_start t3, TargetConstant:i64<32>,
TargetConstant:i64<0>
t12: ch,glue = CopyToReg t7, Register:i64 %X3, FrameIndex:i64<1>
t16: ch,glue = PPCISD::CALL_NOP t12, TargetGlobalAddress:i64<void
(%class.F*)* @_Z10EmitLValuev> 0, Register:i64 %X3, Register:i64 %X2,
RegisterMask:Untyped, t12:1
t17: ch,glue = callseq_end t16, TargetConstant:i64<32>,
TargetConstant:i64<0>, t16:1
t20: ch = lifetime.start t17, TargetFrameIndex:i64<0>
t21: i64 = Constant<96>
t22: i64 = Constant<0>
t24: v4i32,ch = load<LD16[%0](align=8)(dereferenceable)> t20,
FrameIndex:i64<1>, undef:i64
t27: i64 = add FrameIndex:i64<1>, Constant:i64<16>
t28: v4i32,ch = load<LD16[%0+16](align=8)(dereferenceable)> t20, t27,
undef:i64
t31: i64 = add FrameIndex:i64<1>, Constant:i64<32>
t32: v4i32,ch = load<LD16[%0+32](align=8)(dereferenceable)> t20, t31,
undef:i64
t36: i64 = add FrameIndex:i64<1>, Constant:i64<48>
t37: v4i32,ch = load<LD16[%0+48](align=8)(dereferenceable)> t20, t36,
undef:i64
t41: i64 = add FrameIndex:i64<1>, Constant:i64<64>
t42: v4i32,ch = load<LD16[%0+64](align=8)(dereferenceable)> t20, t41,
undef:i64
t46: i64 = add FrameIndex:i64<1>, Constant:i64<80>
t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46,
undef:i64
t25: ch = store<ST16[%1]> t20, t24, FrameIndex:i64<0>, undef:i64
t29: i64 = add FrameIndex:i64<0>, Constant:i64<16>
t30: ch = store<ST16[%1+16]> t20, t28, t29, undef:i64
t33: i64 = add FrameIndex:i64<0>, Constant:i64<32>
t34: ch = store<ST16[%1+32]> t20, t32, t33, undef:i64
t38: i64 = add FrameIndex:i64<0>, Constant:i64<48>
t39: ch = store<ST16[%1+48]> t20, t37, t38, undef:i64
t43: i64 = add FrameIndex:i64<0>, Constant:i64<64>
t44: ch = store<ST16[%1+64]> t20, t42, t43, undef:i64
t48: i64 = add FrameIndex:i64<0>, Constant:i64<80>
t49: ch = store<ST16[%1+80]> t20, t47, t48, undef:i64
t50: ch = TokenFactor t24:1, t25, t28:1, t30, t32:1, t34, t37:1, t39, t42:1,
t44, t47:1, t49
t51: i64 = GlobalAddress<void (%class.F*)* @_Z3fn11F> 0
t53: ch,glue = callseq_start t50, TargetConstant:i64<128>,
TargetConstant:i64<0>
t54: i32 = Constant<96>
t55: v4i32,ch = load<LD16[FixedStack1](align=8)> t50,
FrameIndex:i64<1>, undef:i64
t57: v4i32,ch = load<LD16[FixedStack1+16](align=8)> t50, t27, undef:i64
t60: v4i32,ch = load<LD16[FixedStack1+32](align=8)> t50, t31, undef:i64
t63: v4i32,ch = load<LD16[FixedStack1+48](align=8)> t50, t36, undef:i64
t66: v4i32,ch = load<LD16[FixedStack1+64](align=8)> t50, t41, undef:i64
t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64
t56: ch = store<ST16[<unknown>](align=8)> t50, t55, t10,
undef:i64
t58: i64 = add t10, Constant:i64<16>
t59: ch = store<ST16[<unknown>](align=8)> t50, t57, t58,
undef:i64
t61: i64 = add t10, Constant:i64<32>
t62: ch = store<ST16[<unknown>](align=8)> t50, t60, t61,
undef:i64
t64: i64 = add t10, Constant:i64<48>
t65: ch = store<ST16[<unknown>](align=8)> t50, t63, t64,
undef:i64
t67: i64 = add t10, Constant:i64<64>
t68: ch = store<ST16[<unknown>](align=8)> t50, t66, t67,
undef:i64
t70: i64 = add t10, Constant:i64<80>
t71: ch = store<ST16[<unknown>](align=8)> t50, t69, t70,
undef:i64
t72: ch = TokenFactor t55:1, t56, t57:1, t59, t60:1, t62, t63:1, t65, t66:1,
t68, t69:1, t71
t73: ch,glue = callseq_start t72, TargetConstant:i64<128>,
TargetConstant:i64<0>
t74: i64,ch = load<LD8[FixedStack1]> t73, FrameIndex:i64<1>,
undef:i64
t76: i64 = add FrameIndex:i64<1>, Constant:i64<8>
t77: i64,ch = load<LD8[FixedStack1+8]> t73, t76, undef:i64
t78: i64,ch = load<LD8[FixedStack1+16]> t73, t27, undef:i64
t80: i64 = add FrameIndex:i64<1>, Constant:i64<24>
t81: i64,ch = load<LD8[FixedStack1+24]> t73, t80, undef:i64
t82: i64,ch = load<LD8[FixedStack1+32]> t73, t31, undef:i64
t84: i64 = add FrameIndex:i64<1>, Constant:i64<40>
t85: i64,ch = load<LD8[FixedStack1+40]> t73, t84, undef:i64
t86: i64,ch = load<LD8[FixedStack1+48]> t73, t36, undef:i64
t88: i64 = add FrameIndex:i64<1>, Constant:i64<56>
t89: i64,ch = load<LD8[FixedStack1+56]> t73, t88, undef:i64
t90: ch = TokenFactor t74:1, t77:1, t78:1, t81:1, t82:1, t85:1, t86:1, t89:1
t91: ch,glue = CopyToReg t90, Register:i64 %X3, t74
t93: ch,glue = CopyToReg t91, Register:i64 %X4, t77, t91:1
t95: ch,glue = CopyToReg t93, Register:i64 %X5, t78, t93:1
t97: ch,glue = CopyToReg t95, Register:i64 %X6, t81, t95:1
t99: ch,glue = CopyToReg t97, Register:i64 %X7, t82, t97:1
t101: ch,glue = CopyToReg t99, Register:i64 %X8, t85, t99:1
t103: ch,glue = CopyToReg t101, Register:i64 %X9, t86, t101:1
t105: ch,glue = CopyToReg t103, Register:i64 %X10, t89, t103:1
t107: ch,glue = PPCISD::CALL_NOP t105, TargetGlobalAddress:i64<void
(%class.F*)* @_Z3fn11F> 0, Register:i64 %X3, Register:i64 %X4, Register:i64
%X5, Register:i64 %X6, Register:i64 %X7, Register:i64 %X8, Register:i64 %X9,
Register:i64 %X10, Register:i64 %X2, RegisterMask:Untyped, t105:1
t109: i64 = GlobalAddress<i32 (%class.F*)* @_ZN1F11isGlobalRegEv> 0
t108: ch,glue = callseq_end t107, TargetConstant:i64<128>,
TargetConstant:i64<0>, t107:1
t110: ch,glue = callseq_start t108, TargetConstant:i64<32>,
TargetConstant:i64<0>
t111: ch,glue = CopyToReg t110, Register:i64 %X3, FrameIndex:i64<0>
t113: ch,glue = PPCISD::CALL_NOP t111, TargetGlobalAddress:i64<i32
(%class.F*)* @_ZN1F11isGlobalRegEv> 0, Register:i64 %X3, Register:i64 %X2,
RegisterMask:Untyped, t111:1
t114: ch,glue = callseq_end t113, TargetConstant:i64<32>,
TargetConstant:i64<0>, t113:1
t115: i64,ch,glue = CopyFromReg t114, Register:i64 %X3, t114:1
t117: i64 = AssertSext t115, ValueType:ch:i32
t118: i32 = truncate t117
t119: ch = lifetime.end t115:1, TargetFrameIndex:i64<0>
t120: ch = lifetime.end t119, TargetFrameIndex:i64<1>
t121: ch = PPCISD::RET_FLAG t120
-------------- next part --------------
A non-text attachment was scrubbed...
Name: MMO_Flag_Mismatch.ll
Type: application/octet-stream
Size: 1726 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20171013/a804467a/attachment.obj>
Hal Finkel via llvm-dev
2017-Oct-26 00:28 UTC
[llvm-dev] [SelectionDAG] Assertion due to MachineMemOperand flags difference.
On 10/13/2017 06:05 PM, Sean Fertile via llvm-dev wrote:> Hello, > > I've hit an assertion in SelectionDAG where we try to merge 2 loads > that have the same operands but their MMO flags differ. One is > dereferenceable and one is not. I'm not sure what the underlying issue > here is: > > 1) MDSDNode with the same operands should have the same flags set on > their respective MMO. The fact the flags differ when the > opcode,types,operands and address-space are the same is the problem. > > 2) Having MDSDnodes with the same operands but different MMO flags is > possible, so the Flags should be added to the FoldingSetNodeID. > > 3) Something else I haven't considered. > > I have a patch posted implementing 2, but don't know if I should look > at fixing 1 as well (or perhaps instead). The loads that trigger the > assertion are: > > t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46, undef:i64 > t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64 > > I would expect the the second load should also be marked > dereferenceable since its loading from one of the TargetFrames. Am I > on the right track here?Hi, Sean, To follow-up on this, first I'll note that, IIRC, we've fixed a couple of related problems in the recent past. Specifically, the following come to mind: r309930 and r306404. In this case, it seems like something is dropping this information somewhere. Or, perhaps, I don't see why we have a "FixedStack" location that's not marked as dereferenceable. That having been said... As for whether "having MDSDnodes with the same operands but different MMO flags is possible", the answer is yes. Of the bits that are currently defined, MOLoad and MOStore are attached to fundamental properties of the corresponding SDNode. Of the remainder: MOVolatile, MONonTemporal, MODereferenceable, and MOInvariant, some of these can be set independently on a per-operation basis. For example, one can certainly have a two volatile loads of the same address, one which is non-temporal and one which is not. However, those bits are already added to the FoldingSetNodeID, because they're mirrored by the MemSDNode's MemSDNodeBits (which has bits for IsVolatile, IsNonTemporal, IsDereferenceable, and IsInvariant), and these should be returned by getRawSubclassData, which is called in all of the places that D38898 modifies). Thus, in short, I don't understand why the IsDereferenceable bit is not already included in the FoldingSetNodeID. -Hal> > Thanks > Sean > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-- Hal Finkel Lead, Compiler Technology and Programming Languages Leadership Computing Facility Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20171025/a7b43730/attachment-0001.html>
Sean Fertile via llvm-dev
2017-Oct-26 21:19 UTC
[llvm-dev] [SelectionDAG] Assertion due to MachineMemOperand flags difference.
Hi Hal, Thanks for the reply, that clears up a fair bit for me. I think the reason we have the 'FixedStack' location without it being marked as dereferenceable is because its created in CreateCopyOfByValArgument where we use a default MachinePointerInfo in the call to getMemcpy. As for fixing the assertion I think I understand now why we look up a node with different MMO flags, its a 'MemIntrinsicSDNode' which doesn't add its subclass data to the FoldingSetNode. I'll update the patch on D38898 accordingly. Thanks Sean On Wed, Oct 25, 2017 at 8:28 PM, Hal Finkel <hfinkel at anl.gov> wrote:> On 10/13/2017 06:05 PM, Sean Fertile via llvm-dev wrote: > > Hello, > > I've hit an assertion in SelectionDAG where we try to merge 2 loads > that have the same operands but their MMO flags differ. One is > dereferenceable and one is not. I'm not sure what the underlying issue > here is: > > 1) MDSDNode with the same operands should have the same flags set on > their respective MMO. The fact the flags differ when the > opcode,types,operands and address-space are the same is the problem. > > 2) Having MDSDnodes with the same operands but different MMO flags is > possible, so the Flags should be added to the FoldingSetNodeID. > > 3) Something else I haven't considered. > > I have a patch posted implementing 2, but don't know if I should look > at fixing 1 as well (or perhaps instead). The loads that trigger the > assertion are: > > t47: v4i32,ch = load<LD16[%0+80](align=8)(dereferenceable)> t20, t46, > undef:i64 > t69: v4i32,ch = load<LD16[FixedStack1+80](align=8)> t50, t46, undef:i64 > > I would expect the the second load should also be marked > dereferenceable since its loading from one of the TargetFrames. Am I > on the right track here? > > > Hi, Sean, > > To follow-up on this, first I'll note that, IIRC, we've fixed a couple of > related problems in the recent past. Specifically, the following come to > mind: r309930 and r306404. In this case, it seems like something is dropping > this information somewhere. Or, perhaps, I don't see why we have a > "FixedStack" location that's not marked as dereferenceable. That having been > said... > > As for whether "having MDSDnodes with the same operands but different MMO > flags is possible", the answer is yes. Of the bits that are currently > defined, MOLoad and MOStore are attached to fundamental properties of the > corresponding SDNode. Of the remainder: MOVolatile, MONonTemporal, > MODereferenceable, and MOInvariant, some of these can be set independently > on a per-operation basis. For example, one can certainly have a two volatile > loads of the same address, one which is non-temporal and one which is not. > However, those bits are already added to the FoldingSetNodeID, because > they're mirrored by the MemSDNode's MemSDNodeBits (which has bits for > IsVolatile, IsNonTemporal, IsDereferenceable, and IsInvariant), and these > should be returned by getRawSubclassData, which is called in all of the > places that D38898 modifies). Thus, in short, I don't understand why the > IsDereferenceable bit is not already included in the FoldingSetNodeID. > > -Hal > > > Thanks > Sean > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > -- > Hal Finkel > Lead, Compiler Technology and Programming Languages > Leadership Computing Facility > Argonne National Laboratory