Alok Sharma via llvm-dev
2019-Nov-28 19:29 UTC
[llvm-dev] DW_OP_implicit_pointer design/implementation in general
Hi folks,

I am pushing a PoC patch, https://reviews.llvm.org/D70833, for review. It covers the case where a temporary is promoted.

For such cases it generates IR like:

    call void @llvm.dbg.derefval(metadata i32 3, metadata !25, metadata !DIExpression(DW_OP_LLVM_explicit_pointer, DW_OP_LLVM_arg0)), !dbg !32

and the llvm-dwarfdump output looks like:

-------------
0x0000007b:   DW_TAG_inlined_subroutine
                DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi")
                DW_AT_low_pc (0x00000000004004c6)
                DW_AT_high_pc (0x00000000004004d0)
                DW_AT_call_file ("/home/alok/openllvm/llvm-project_derefval/build.d/david.cc")
                DW_AT_call_line (10)
                DW_AT_call_column (0x03)

0x00000088:     DW_TAG_formal_parameter
                  DW_AT_location (indexed (0x0) loclist = 0x00000010:
                     [0x00000000004004c6, 0x00000000004004d4): DW_OP_explicit_pointer, DW_OP_lit3)
                  DW_AT_abstract_origin (0x00000055 "p")
------------

Please note that DW_OP_explicit_pointer denotes that the following value is the dereferenced value of an optimized-out pointer. With the necessary changes in LLDB, this DWARF info can be used to show the explicit dereferenced value of 'p'.

Hi David,

Should we keep working on the above case separately and resume the review of the implicit pointer patches independently now? They have been updated with many suggestions from this discussion.

Regards,
Alok

On Wed, Nov 20, 2019 at 11:24 PM Jeremy Morse <jeremy.morse.llvm at gmail.com> wrote:
> Hi,
>
> For a new way of representing things,
>
> Adrian wrote:
> > llvm.dbg.value_new(DILocalVariable("y"), DIExpression(DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus),
> >                    %ptr, %ofs)
>
> I think this would be great -- there're definitely some constructs
> created by the induction-variables pass and similar where one could
> recover an implicit variable value, if you could for example subtract
> one pointer from another.
>
> With the current model of storing DIExpressions as a vector of
> opcodes, it might become a pain to salvage a Value that gets optimised
> out -- in the example, if %ofs were salvaged, presumably
> DW_OP_LLVM_arg1 would have to be replaced with several extra
> operations. This isn't insurmountable, but I've repeatedly shied away
> from scanning through DIExpressions to patch them up. A vector of
> opcodes is the final output of the compiler; IMHO richer metadata
> should be used in the meantime.
>
> IMHO the implicit pointer work doesn't need to block on this. As said,
> my mild preference would be for a new intrinsic for this form of
> variable location.
>
> ~
>
> In re PR37682,
>
> > I’ve been reminded of PR37682, where a function with a reference
> > parameter might spend all its time computing the “referenced” value in a
> > temp, and only move the final value back to the referenced object at the
> > end. This is clearly a situation that could benefit from
> > DW_OP_implicit_pointer, and there is really no other-object DIE for it to
> > refer to. Given the current spec, the compiler would need to produce a
> > DW_TAG_dwarf_procedure for the parameter DIE to refer to. Appendix D
> > (Figure D.61) has an example of this construction, although it’s a more
> > contrived source example.
>
> This has been working through my mind too, and I think it's slightly
> different to what implicit_pointer is trying to achieve. In the case
> implicit_pointer is designed for, it's a strict improvement in debug
> experience because you're recovering information that couldn't be
> expressed. However, for PR37682 it's a trade-off between whether the
> user might want to examine the pointer or the pointed-at integer:
> AFAIUI, we can only express one of the two, not both. Whereas for
> mem2reg'd variables referred to by DIE, there is never a pointer to be
> lost.
>
> I think my preference would always be to see temporarily-promoted
> values as there's no other way of observing them, but others might
> disagree.
>
> --
> Thanks,
> Jeremy
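
For readers following along, here is a minimal sketch of the kind of source that could lead to the dump above. Only the signature sink(const int &) (implied by the mangled name _Z4sinkRKi) and the parameter name 'p' come from the dwarfdump output. The actual david.cc is not shown in this thread, so the function body and the name caller() are assumptions made purely for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for david.cc, which is not shown in the thread.
// Only the signature sink(const int &) and the parameter name 'p' are
// taken from the dwarfdump output; the body and caller() are made up.
inline int sink(const int &p) {
    return p;  // reads the value through the reference
}

int caller() {
    // The literal 3 is materialized as a temporary and 'p' binds to it.
    // If sink() is inlined and the temporary is promoted to a constant,
    // there may be no storage left for 'p' to refer to.
    int seen = sink(3);
    assert(seen == 3);  // sanity check on the value read through 'p'
    return seen;
}
```

When built with optimization, a compiler may inline sink() and fold the temporary away. That is the situation in which the dump above describes 'p' by the value it would have yielded (DW_OP_lit3) rather than by an address.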
David Blaikie via llvm-dev
2019-Nov-28 23:09 UTC
[llvm-dev] DW_OP_implicit_pointer design/implementation in general
Sorry I haven't been more engaged with this thread. I have been reading it, so hopefully my reply isn't completely out of line or irrelevant. I still feel that a custom DWARF expression operator (and no new intrinsics) would make the most sense, like the one or two other DW_OP_LLVM_* operators we already have. Those aren't actually emitted into the DWARF, though this one perhaps could be in some or all cases as an extension, or a synthesized variable could be created for compatibility with the current DWARF standard.

Some thought experiments that I think are relevant:
* Does the proposed IR format scale to pointers that don't point to existing variables? (I think that has already been touched on in this thread.)
* Does the proposed IR format support multiple layers of dereference (e.g. int **, where we know it ultimately points to the value 3 but can't describe either the first- or second-level pointer that gets to that value)? It sounds like any intrinsic that's special-cased to deref (like llvm.dbg.derefval) wouldn't be able to capture that, which seems overly narrow/special-case, then?
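
The second thought experiment can be pictured with a small source example. This is only a sketch: the names readTwice and twoLevelCaller are hypothetical and do not come from any patch in this thread. After inlining and mem2reg, a compiler may remove 'value', 'first', and the 'pp' parameter entirely, so neither pointer level has a describable location even though the final value is known to be 3.

```cpp
#include <cassert>

// Hypothetical example of a two-level pointer whose final pointee is known.
inline int readTwice(int **pp) {
    assert(pp != nullptr && *pp != nullptr);
    return **pp;  // two layers of dereference
}

int twoLevelCaller() {
    int value = 3;             // the value ultimately pointed to
    int *first = &value;       // pointer to the value
    return readTwice(&first);  // pointer to that pointer
}
```

A representation that covers this case would have to describe **pp as 3 without having a location for either pp or *pp.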
Alok Sharma via llvm-dev
2019-Nov-29 04:42 UTC
[llvm-dev] DW_OP_implicit_pointer design/implementation in general
Let me try to summarize the implementation first. At the moment, there are two branches.

1. An existing variable is optimized out, and that variable was used to get the dereferenced value pointed to by another pointer/reference variable.
   Such cases are addressed using the DWARF operation DW_OP_implicit_pointer, as the dereferenced value of the pointer can be seen implicitly (through another variable). In LLVM IR, before the DWARF is emitted, we represent it using dbg.derefval (which denotes the dereferenced value of a pointer or reference) and the DW_OP_LLVM_implicit_pointer operation.

2. A temporary is optimized out, and that temporary was used to get the dereferenced value of another reference variable (AFAIK it cannot be reproduced with pointers).
   Such cases are addressed using a new DWARF operation, DW_OP_explicit_pointer, as the dereferenced value can be displayed explicitly (in place). In LLVM IR, we represent it using dbg.derefval and the DW_OP_LLVM_explicit_pointer operation.

The two branches share some common implementation that defines the new operations (DWARF and IR): D70642, D70643, D69999, D69886. The first branch has additional patches (D70260, D70384, D70385, D70419). The second branch has one additional patch (D70833).

Let me try to comment on the points you raised.
- Branch 2 (patch D70833) handles cases where temporaries (not existing variables) are optimized out.
- In patch D70385, I have included test points to show that multi-layered pointers work (llvm/test/DebugInfo/dwarfdump-implicit_pointer_mem2reg.c).

I feel that the review of branch 1 (implicit pointer) can be resumed (it was halted due to the current discussion), while we continue to discuss branch 2 (explicit pointers, D70833) if you want.

David, what do you think?
Regards,
Alok
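
To make the two branches in the summary concrete, here is a rough source-level sketch of the shape each one starts from. All function and variable names are hypothetical and are not taken from the patches or their tests. Whether a given compiler actually optimizes these locals away depends on the optimization level and is only assumed here.

```cpp
#include <cassert>

// Branch 1 shape: the pointer refers to a named local variable. If 'var'
// is promoted to a register or constant, 'ptr' has no address to hold,
// but its pointee could still be described through 'var'.
inline int readThroughPointer(const int *ptr) {
    assert(ptr != nullptr);
    return *ptr;
}

int branch1Caller() {
    int var = 4;
    return readThroughPointer(&var);
}

// Branch 2 shape: the reference binds to an unnamed temporary. There is
// no other variable to refer to, so only the value itself can be shown.
inline int readThroughReference(const int &ref) {
    return ref;
}

int branch2Caller() {
    return readThroughReference(3);
}
```

The difference between the two is only in what the pointee is: a named variable in the first case, an unnamed temporary in the second.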