Alok Sharma via llvm-dev
2019-Nov-29 04:42 UTC
[llvm-dev] DW_OP_implicit_pointer design/implementation in general
Let me try to summarize the implementation first.

At the moment, there are two branches.

1. When an existing variable is optimized out and that variable is used to
get the dereferenced value pointed to by another pointer/reference variable.
Such cases are addressed using the DWARF expression DW_OP_implicit_pointer,
as the dereferenced value of a pointer can be seen implicitly (using another
variable). Before DWARF is emitted, we represent it in LLVM IR using
dbg.derefval (which denotes the dereferenced value of a pointer or reference)
and the DW_OP_LLVM_implicit_pointer operation.

2. When a temporary variable is optimized out and that variable is used to
get the dereferenced value of another reference variable (AFAIK it cannot be
reproduced with pointers).
Such cases are addressed using a new DWARF expression DW_OP_explicit_pointer,
as the dereferenced value can be displayed explicitly (in place). In LLVM IR,
we represent it using dbg.derefval and the DW_OP_LLVM_explicit_pointer
operation.

Both branches share some common implementation that defines the new
operations (DWARF and IR): D70642, D70643, D69999, D69886.
The first branch has additional patches (D70260, D70384, D70385, D70419).
The second branch has one additional patch (D70833).

Let me try to comment on the points you raised.
- Branch 2 (patch D70833) handles cases when temporaries (not existing
  variables) are optimized out.
- In patch D70385, I have included test points to show that multi-layered
  pointers are working
  (llvm/test/DebugInfo/dwarfdump-implicit_pointer_mem2reg.c).

I feel that review of branch 1 (implicit pointer), which was halted due to
the current discussion, can be resumed, while we continue to discuss branch 2
(explicit pointers, D70833) if you want. David, what do you think?

Regards,
Alok

On Fri, Nov 29, 2019 at 4:40 AM David Blaikie <dblaikie at gmail.com> wrote:

> Sorry I haven't been more engaged with this thread, I have been reading
> it, so hopefully my reply isn't completely out of line/irrelevant - but I
> still feel like having a custom dwarf expression operator (& no new
> intrinsics), like we have for one or two other DW_OP_LLVM_* (that aren't
> actually generated into the DWARF - though this one perhaps could be in
> some/all cases as an extension, maybe - or a synthesized variable could be
> created for compatibility with the current DWARF standard) would make the
> most sense.
>
> Some thought experiments that I think are relevant:
> * does the proposed IR format scale to pointers that don't point to
>   existing variables (that I think has already been touched on in this
>   thread)
> * does the proposed IR format support multiple layers of dereference (eg:
>   int ** where we know it ultimately points to the value 3 but can't
>   describe either the first or second level pointers that get to that
>   value) - it sounds like any intrinsic that's special cased to deref
>   (like llvm.dbg.derefval) wouldn't be able to capture that, which seems
>   like it's overly narrow/special case, then?
>
> On Thu, Nov 28, 2019 at 2:29 PM Alok Sharma via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi folks,
>>
>> I am pushing a PoC patch https://reviews.llvm.org/D70833 for review
>> which includes the case when temporary is promoted.
>>
>> For such cases it generates IR as
>>
>> call void @llvm.dbg.derefval(metadata i32 3, metadata !25, metadata
>> !DIExpression(DW_OP_LLVM_explicit_pointer, DW_OP_LLVM_arg0)), !dbg !32
>>
>> And llvm-dwarfdump output looks like
>>
>> -------------
>> 0x0000007b:   DW_TAG_inlined_subroutine
>>                 DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi")
>>                 DW_AT_low_pc (0x00000000004004c6)
>>                 DW_AT_high_pc (0x00000000004004d0)
>>                 DW_AT_call_file
>> ("/home/alok/openllvm/llvm-project_derefval/build.d/david.cc")
>>                 DW_AT_call_line (10)
>>                 DW_AT_call_column (0x03)
>>
>> 0x00000088:     DW_TAG_formal_parameter
>>                   DW_AT_location (indexed (0x0) loclist
>> 0x00000010:
>>                      [0x00000000004004c6, 0x00000000004004d4):
>> DW_OP_explicit_pointer, DW_OP_lit3)
>>                   DW_AT_abstract_origin (0x00000055 "p")
>> ------------
>>
>> Please note that DW_OP_explicit_pointer denotes that following value
>> represents de-referenced value of optimized out pointer. With necessary
>> changes in LLDB debugger this dwarf info can help to detect the explicit
>> de-referenced value of 'p'.
>>
>> Hi David,
>>
>> Should we keep on working for the above case separately and resume the
>> review of implicit pointer independently now, which is updated with many
>> suggestions from this discussion?
>>
>> Regards,
>> Alok
>>
>>
>> On Wed, Nov 20, 2019 at 11:24 PM Jeremy Morse <
>> jeremy.morse.llvm at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> For a new way of representing things,
>>>
>>> Adrian wrote:
>>> > llvm.dbg.value_new(DILocalVariable("y"),
>>> >                    DIExpression(DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus),
>>> >                    %ptr, %ofs)
>>>
>>> I think this would be great -- there're definitely some constructs
>>> created by the induction-variables pass and similar where one could
>>> recover an implicit variable value, if you could for example subtract
>>> one pointer from another.
>>>
>>> With the current model of storing DIExpressions as a vector of
>>> opcodes, it might become a pain to salvage a Value that gets optimised
>>> out --in the example, if %ofs were salvaged, presumably
>>> DW_OP_LLVM_arg1 could have to be replaced with several extra
>>> operations. This isn't insurmountable, but I've repeatedly shied away
>>> from scanning through DIExpressions to patch them up. A vector of
>>> opcodes is the final output of the compiler, IMHO richer metadata
>>> should be used in the meantime.
>>>
>>> IMHO the implicit pointer work doesn't need to block on this. As said
>>> my mild preference would be for a new intrinsic for this form of
>>> variable location.
>>>
>>> ~
>>>
>>> Inre PR37682,
>>>
>>> > I’ve been reminded of PR37682, where a function with a reference
>>> > parameter might spend all its time computing the “referenced” value
>>> > in a temp, and only move the final value back to the referenced
>>> > object at the end. This is clearly a situation that could benefit
>>> > from DW_OP_implicit_pointer, and there is really no other-object DIE
>>> > for it to refer to. Given the current spec, the compiler would need
>>> > to produce a DW_TAG_dwarf_procedure for the parameter DIE to refer
>>> > to. Appendix D (Figure D.61) has an example of this construction,
>>> > although it’s a more contrived source example.
>>>
>>> This has been working through my mind too, and I think it's slightly
>>> different to what implicit_pointer is trying to achieve. In the case
>>> implicit_pointer is designed for, it's a strict improvement in debug
>>> experience because you're recovering information that couldn't be
>>> expressed. However for PR37682 it's a trade-off between whether the
>>> user might want to examine the pointer, or the pointed-at integer:
>>> AFAIUI, we can only express one of the two, not both. Whereas for
>>> mem2reg'd variables referred to by DIE, there is never a pointer to be
>>> lost.
>>>
>>> I think my preference would always be to see temporarily-promoted
>>> values as there's no other way of observing them, but others might
>>> disagree.
>>>
>>> --
>>> Thanks,
>>> Jeremy

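Jeremy's concern about salvaging inside a flat vector of opcodes can be made concrete with a toy model. The sketch below is not LLVM code: `ToyExpr` and `spliceArg` are hypothetical names, the expression is modelled as a list of strings rather than LLVM's real encoding, and it assumes, purely for illustration, that `%ofs` was computed as `%idx * 4`, so that salvaging it means rebinding argument 1 to `%idx` and appending a multiply.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy representation: one string per operation. LLVM's real
// DIExpression stores raw integer words, not strings.
using ToyExpr = std::vector<std::string>;

// Replace every occurrence of `arg` in `expr` with the operations in
// `replacement`, leaving all other operations in their original order.
ToyExpr spliceArg(const ToyExpr& expr, const std::string& arg,
                  const ToyExpr& replacement) {
    assert(!arg.empty() && "the placeholder to replace must be named");
    ToyExpr out;
    for (const std::string& op : expr) {
        if (op == arg)
            out.insert(out.end(), replacement.begin(), replacement.end());
        else
            out.push_back(op);
    }
    return out;
}
```

Calling `spliceArg` on `{DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus}` with the replacement `{DW_OP_LLVM_arg1, "DW_OP_constu 4", DW_OP_mul}` grows the expression from three operations to five. The rewrite itself is simple, but every pass that salvages a value has to scan and patch the vector this way, which is the bookkeeping Jeremy says he has shied away from.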
Alok Sharma via llvm-dev
2019-Dec-11 05:06 UTC
[llvm-dev] DW_OP_implicit_pointer design/implementation in general
Hi David,

This is regarding the missing multilevel handling in the branch for explicit
pointers.

> * does the proposed IR format support multiple layers of dereference (eg:
>   int ** where we know it ultimately points to the value 3 but can't
>   describe either the first or second level pointers that get to that
>   value) - it sounds like any intrinsic that's special cased to deref
>   (like llvm.dbg.derefval) wouldn't be able to capture that, which seems
>   like it's overly narrow/special case, then?

The PoC of DW_OP_LLVM_explicit_pointer does not handle multilevel
indirection. For now, the reason is as follows.

Explicit pointer handles cases where a variable points to a temporary that
contains a constant. Due to language standard constraints, we don't find
pointers in such cases; what we get is references. Unlike pointers,
references have a single level (a reference to a reference is just a
reference, while a pointer to a pointer is a double pointer).
In the reference-to-reference case, the second level can be handled using
DW_OP_LLVM_explicit_pointer itself.
In the pointer-to-reference case, the second level can be handled using
DW_OP_implicit_pointer.

Though it would not be complex to make explicit pointer multilevel, I
avoided doing so due to the lack of a use case. Please let me know if I am
missing something.

Regards,
Alok

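The language-level claim in the message above (a reference to a reference is still one reference, while a pointer to a pointer adds a level) can be checked at compile time. The following is only an illustrative sketch: `IndirectionLevels` is a hypothetical helper written for this note, not something from the patches, and it counts a reference as one level of indirection. The last assertion shows a case the message does not cover: a reference bound to a pointer still has two levels.

```cpp
#include <type_traits>

// Reference collapsing: a "reference to reference" formed through an alias
// is still the single-level type int&.
using Ref = int&;
using RefToRef = Ref&;
static_assert(std::is_same<RefToRef, int&>::value,
              "reference to reference collapses to one reference");

// Hypothetical helper: number of indirections between T and the underlying
// object, counting a reference as one level and ignoring const.
template <typename T>
struct IndirectionLevels { static constexpr int value = 0; };
template <typename T>
struct IndirectionLevels<const T> : IndirectionLevels<T> {};
template <typename T>
struct IndirectionLevels<T*> {
    static constexpr int value = 1 + IndirectionLevels<T>::value;
};
template <typename T>
struct IndirectionLevels<T&> {
    static constexpr int value = 1 + IndirectionLevels<T>::value;
};

static_assert(IndirectionLevels<int**>::value == 2,
              "pointer to pointer has two levels");
static_assert(IndirectionLevels<RefToRef>::value == 1,
              "collapsed reference has one level");
static_assert(IndirectionLevels<int* const&>::value == 2,
              "a reference bound to a pointer still has two levels");
```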
David Blaikie via llvm-dev
2019-Dec-18 23:24 UTC
[llvm-dev] DW_OP_implicit_pointer design/implementation in general
(I'm still pretty concerned that there are IR changes going in for a feature
that seems incomplete and more invasive than really seems justified to me -
though I admit I'm clearly not paying enough attention to this feature to
have a nuanced/fully informed opinion & so maybe I just need to step back
from all of this - but given the addition of new intrinsics, it seems like
there should be more clear design discussion)

On Tue, Dec 10, 2019 at 9:06 PM Alok Sharma <aloksharma.knit at gmail.com> wrote:

> Hi David,
>
> This is regarding missing multilevel handling in branch for explicit
> pointers.
>
> > * does the proposed IR format support multiple layers of dereference
> > (eg: int ** where we know it ultimately points to the value 3 but can't
> > describe either the first or second level pointers that get to that
> > value) - it sounds like any intrinsic that's special cased to deref
> > (like llvm.dbg.derefval) wouldn't be able to capture that, which seems
> > like it's overly narrow/special case, then?
>
> The PoC of DW_OP_LLVM_explicit_pointer does not have handling of
> multilevel indirection. As of now it is so due to below reason.
>
> Explicit pointer handles cases when variable points to a temporary which
> contains constant. Due to language standard constraints, we don't find
> pointers in such cases, what we get is references. Unlike pointers,
> references have single level. (reference to reference is just reference
> while pointer to pointer is double pointer).
> Case of reference to reference, second level can be handled using
> DW_OP_LLVM_explicit_pointer itself.
> Case of pointer to reference, second level can be handled using
> DW_OP_implicit_pointer.
>
> Though it would not be complex to make explicit pointer multilevel, I
> avoided so due to lack of use case. Please let me know if I am missing
> something.

Sorry, I couldn't understand your language related to references and
pointers - I don't understand why they would be handled differently or
represent challenges/tradeoffs for features related to collapsed indirection
like this.

Multi-level indirection seems to have as much use as single level
indirection. (if a DWARF user may want to know what a pointer points to even
when what it points to isn't in memory, the same would hold true for
pointers to pointers, etc)

I would expect this to be handled with a general OP saying "hey, I'm
skipping one level of indirection in the resulting value, because that
indirection is missing/not in the final program" and that this would be
encoded in a llvm.dbg.value/DIExpression as usual, without the need for new
IR intrinsics, though possibly with the need for an LLVM extension DWARF OP
(DW_OP_LLVM_explicit_pointer?)

To reconstitute that general form into the current DWARF limited
"indirection needs to refer to another variable DIE" issue - as I think Paul
speculated previously, we could always reconstitute a synthetic variable DIE
& not try to reflect the case where the indirection lands at another
named/known variable - as I expect that's the minority case. In most cases
in C++ I expect pointers and references do not refer to named variables in
the same function. They refer to return values from functions, they refer to
array elements in dynamically allocated arrays, etc, etc.

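The fallback David mentions, reconstituting a synthetic variable DIE so that the standard DW_OP_implicit_pointer has something to refer to, might look roughly like the toy lowering below. Everything here is an assumption made for illustration: `ToyDie` and `lowerExplicitPointer` are hypothetical names, DIEs are modelled as a flat vector, locations are plain strings, and the free DIE offset is supplied by the caller. It is not how LLVM's DWARF emitter is structured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical, heavily simplified DIE: just enough fields for the sketch.
struct ToyDie {
    std::uint32_t offset;   // position of the DIE in the toy unit
    std::string name;       // empty for an artificial DIE
    bool artificial;        // true for compiler-synthesized entries
    std::string location;   // textual location expression (illustrative only)
};

// Lower "pointer whose pointee is the in-place constant `pointee`" to the
// DWARF 5 form: append an unnamed artificial variable holding the value,
// then make the pointer DIE refer to it with DW_OP_implicit_pointer.
// Returns the offset of the synthesized DIE.
std::uint32_t lowerExplicitPointer(std::vector<ToyDie>& dies,
                                   std::size_t pointerIndex,
                                   long long pointee,
                                   std::uint32_t freeOffset) {
    assert(pointerIndex < dies.size());
    ToyDie temp{freeOffset, "", true,
                "DW_OP_consts " + std::to_string(pointee) +
                    ", DW_OP_stack_value"};
    dies.push_back(temp);

    char buf[64];
    std::snprintf(buf, sizeof buf, "DW_OP_implicit_pointer 0x%x +0",
                  static_cast<unsigned>(freeOffset));
    // Index again after push_back: references into the vector may be stale.
    dies[pointerIndex].location = buf;
    return freeOffset;
}
```

Applied to a parameter `p` whose location is `DW_OP_LLVM_explicit_pointer, DW_OP_lit3`, the sketch leaves `p` pointing at a new unnamed DIE that carries the value 3. The consumer can no longer tell whether the pointee was a named variable, which is the trade-off David describes.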
David Blaikie via llvm-dev
2019-Dec-24 20:59 UTC
[llvm-dev] DW_OP_implicit_pointer design/implementation in general
(sorry, I accidentally dropped everyone and the list from the thread, adding
them back in here)

OK, so we're on the same page: none of this has anything to do with
references specifically, but with whether the pointer or reference
points/refers to a named variable or not.

And this was the point I was trying to make (perhaps poorly) at the start:
the DWARF feature (OP_implicit_pointer) is both awkward to implement (seen
by the fact that there's discussion to add new intrinsics to support it) and
only supports a small subset of cases.

My argument is that we should not implement this DWARF feature - or, at
least, not the way we're doing it. We should implement a more general
feature (currently titled OP_LLVM_explicit_pointer - though I'm not sure
"explicit" v "implicit" is expressing, at least to me, the distinction
between these two things) & only that, at least only that at the LLVM IR
level.

Possibly we can implement DWARFv5 standard support (perhaps this is the only
DWARF emission mode we support) where the OP_LLVM_explicit_pointer is
lowered to an artificial variable (doesn't need a name, file/line number,
etc - it's just a big indirection due to the way DWARFv5 implicit_pointer is
specified) with the location attached to it.

We lose some functionality by doing this, for sure - the consumer won't know
that the pointer aliases a variable, though GDB at least doesn't visualize
the name of the variable a pointer points to so far as I can tell, for
example. (Sony/Apple - do you have debuggers that would have particularly
improved UI were you to know that a pointer points to a particular variable,
rather than to the pieces of the variable's value (assuming the variable
isn't completely in memory at all - so it's up in registers,
incomplete/fragmented hunks of memory, etc)?)

I think implementing /only/ this more general solution is tidier (doesn't
introduce new LLVM IR intrinsics & the complexities of tracking which
variable another variable points to) and covers more cases. If at some point
in the future someone finds that the value on top of this, of knowing the
name of the variable being pointed to, is worth the added complexity - then
we could do that. But my gut feel is that it is not worth the added
complexity.

On Mon, Dec 23, 2019 at 11:05 PM Alok Sharma <aloksharma.knit at gmail.com> wrote:

> > Hmm. I don't understand why references would be handled differently than
> > pointers. Could you explain further? References and pointers can point to
> > unnamed entities (things without a variable DIE), so if that's the
> > distinction being drawn, it doesn't sound like the right one.
>
> As in example above pointers can be of multi-level (int *, int **, int
> ***) but references remain single level (int &).
> The difference of handling both "DW_OP_implicit_pointer" and
> "DW_OP_LLVM_explicit_pointer" is not based on pointer or reference. It is
> based on how their values can be represented.
>
> A)- Use implicit_pointer when the value can be found in another DIE (it can
>     be the case of a pointer or a reference, and there can be multiple
>     levels of indirection)
> B)- Use explicit_pointer when the value can be found in place without
>     redirecting to another DIE (the case where the pointee is a temporary;
>     I could not find an example of multi-level redirection with all
>     intermediate levels being temporaries)
>
> For example, the test case below for a reference uses implicit_pointer
>
> -----------------------------------------------
> volatile int gvar = 7;
>
> int func(int &ref) {
>   gvar = ref;
>   return ref + 5;
> }
>
> int main() {
>   int var = 4;
>   int &refVar = var;
>
>   int res = func(refVar);
>
>   return res;
> }
> -----------------------------------------------
>
> This is the case of a reference using DW_OP_implicit_pointer
> -----------------------------------------------
> 0x00000076:   DW_TAG_variable
>                 DW_AT_const_value (4)
>                 DW_AT_name ("var")
>                 DW_AT_decl_file ("ref.cc")
>                 DW_AT_decl_line (9)
>                 DW_AT_type (0x00000037 "int")
>
> 0x0000007f:   DW_TAG_variable
>                 DW_AT_location (indexed (0x1) loclist
> 0x00000023:
>                    [0x0000000000400490, 0x00000000004004a0):
> DW_OP_implicit_pointer 0x76 +0)   <----- refVar points to var
>                 DW_AT_name ("refVar")
>                 DW_AT_decl_file ("ref.cc")
>                 DW_AT_decl_line (10)
>                 DW_AT_type (0x00000062 "int&")
>
> 0x00000088:   DW_TAG_variable
>                 DW_AT_location (indexed (0x2) loclist
> 0x0000002e:
>                    [0x000000000040049a, 0x00000000004004a0):
> DW_OP_consts +9, DW_OP_stack_value)
>                 DW_AT_name ("res")
>                 DW_AT_decl_file ("ref.cc")
>                 DW_AT_decl_line (12)
>                 DW_AT_type (0x00000037 "int")
>
> 0x00000091:   DW_TAG_inlined_subroutine
>                 DW_AT_abstract_origin (0x0000004f "_Z4funcRi")
>                 DW_AT_low_pc (0x0000000000400490)
>                 DW_AT_high_pc (0x000000000040049a)
>                 DW_AT_call_file ("ref.cc")
>                 DW_AT_call_line (12)
>                 DW_AT_call_column (0x0d)
>
> 0x0000009e:     DW_TAG_formal_parameter
>                   DW_AT_location (indexed (0x0) loclist
> 0x00000018:
>                      [0x0000000000400490, 0x00000000004004a0):
> DW_OP_implicit_pointer 0x76 +0)   <---------- ref points to var
>                   DW_AT_abstract_origin (0x00000059 "ref")
> -----------------------------------------------
>
> > Could you show an example of the DWARF using DW_OP_implicit_pointer for
> > multiple levels of indirection,
>
> Let me present a part of the test case
> test/DebugInfo/implicit_pointer_mem2reg.c
> ------------------------------------------------
> # cat multilevel_pointer.c
> static const char *b = "opq";
> volatile int v;
> int main() {
>   int var1 = 4;
>   int *ptr1;
>   int **ptrptr1;
>
>   v++;
>   ptr1 = &var1;
>   ptrptr1 = &ptr1;
>   v++;
>
>   return *ptr1 + **ptrptr1 - 5;
> }
> ------------------------------------------------
> With the current set of patches it produces DWARF as below
>
> -----------------------------------------------
> # llvm-dwarfdump multilevel_pointer
> 0x0000004a:   DW_TAG_variable
>                 DW_AT_const_value (4)
>                 DW_AT_name ("var1")
>                 DW_AT_decl_file
> ("/home/alok/openllvm/llvm-project_derefval/build.d/multilevel_pointer.c")
>                 DW_AT_decl_line (14)
>                 DW_AT_type (0x00000037 "int")
>
> 0x00000053:   DW_TAG_variable
>                 DW_AT_location (indexed (0x0) loclist
> 0x00000014:
>                    [0x0000000000400487, 0x0000000000400494):
> DW_OP_implicit_pointer 0x5c +0)   <----- ptrptr1 points to ptr1
>                 DW_AT_name ("ptrptr1")
>                 DW_AT_decl_file
> ("/home/alok/openllvm/llvm-project_derefval/build.d/multilevel_pointer.c")
>                 DW_AT_decl_line (16)
>                 DW_AT_type (0x00000066 "int**")
>
> 0x0000005c:   DW_TAG_variable
>                 DW_AT_location (indexed (0x1) loclist
> 0x0000001f:
>                    [0x0000000000400487, 0x0000000000400494):
> DW_OP_implicit_pointer 0x4a +0)   <----- ptr1 points to var1
>                 DW_AT_name ("ptr1")
>                 DW_AT_decl_file
> ("/home/alok/openllvm/llvm-project_derefval/build.d/multilevel_pointer.c")
>                 DW_AT_decl_line (15)
>                 DW_AT_type (0x0000006b "int*")
> -------------------------------------------
>
> > , and for cases where the thing being pointed to has no variable DIE?
> > (eg: dynamically allocated objects ("int *i = new int();" but then the
> > compiler optimizes away as in this code (compiled at -O3):
>
> For the test case you provided earlier
> --------------------------------
> __attribute__((optnone)) int source() {
>   return 3;
> }
> __attribute__((optnone)) void f(int i) {
> }
> inline void sink(const int& p) {
>   f(p);
> }
> int main() {
>   sink(source());
> }
> --------------------------------
>
> Since the pointee is a temporary, the dereferenced value can be displayed
> in place using DW_OP_LLVM_explicit_pointer.
>
> --------------------------------
> 0x0000006c:   DW_TAG_subprogram
>                 DW_AT_low_pc (0x00000000004004c0)
>                 DW_AT_high_pc (0x00000000004004d4)
>                 DW_AT_frame_base (DW_OP_reg7 RSP)
>                 DW_AT_call_all_calls (true)
>                 DW_AT_name ("main")
>                 DW_AT_decl_file ("explicit_pointer.cc")
>                 DW_AT_decl_line (14)
>                 DW_AT_type (0x00000068 "int")
>                 DW_AT_external (true)
>
> 0x0000007b:     DW_TAG_inlined_subroutine
>                   DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi")
>                   DW_AT_low_pc (0x00000000004004c6)
>                   DW_AT_high_pc (0x00000000004004d0)
>                   DW_AT_call_file ("explicit_pointer.cc")
>                   DW_AT_call_line (15)
>                   DW_AT_call_column (0x03)
>
> 0x00000088:       DW_TAG_formal_parameter
>                     DW_AT_location (indexed (0x0) loclist
> 0x00000010:
>                        [0x00000000004004c6, 0x00000000004004d4):
> DW_OP_LLVM_explicit_pointer, DW_OP_lit3)   <---- in-place representation
>                                                   of value *p=3
>                     DW_AT_abstract_origin (0x00000055 "p")
> -------------------------------------------------
>
> The other case you just shared can be represented using
> "DW_OP_LLVM_explicit_pointer"; it can be brought into scope.
>
> Since these are *aspirational* cases (GCC also doesn't have them in
> scope), they can be solved separately.
>
> Regards,
> Alok
>
> On Tue, Dec 24, 2019 at 12:52 AM David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>> On Mon, Dec 23, 2019 at 10:23 AM Alok Sharma <aloksharma.knit at gmail.com>
>> wrote:
>>
>>> Hi David,
>>>
>>> > Sorry, I couldn't understand your language related to references and
>>> > pointers - I don't understand why they would be handled differently or
>>> > represent challenges/tradeoffs for features related to collapsed
>>> > indirection like this.
>>>
>>> Let me try to explain what I wanted to convey with an example.
>>>
>>> Example of multilevel pointers:
>>>
>>> int var;
>>> int *ptr = &var;      // first level of indirection
>>> int **ptrptr = &ptr;  // second level of indirection
>>>
>>> Example of multilevel references:
>>>
>>> int var;
>>> int &ref = var;     // first level of reference
>>> int &refref = ref;  // second level of reference
>>>
>>> Though the variable refref is a reference to another reference, it is
>>> still of type reference.
>>>
>>> As I said earlier, I am struggling to find a case where multiple levels of
>>> indirection are needed with DW_OP_LLVM_explicit_pointer in the case of
>>> *references*; please let me know if you have any example in mind. I shall
>>> modify the patch for multiple levels of indirection.
>>> (DW_OP_LLVM_explicit_pointer is used only in the case of references.)
>>>
>>
>> Hmm. I don't understand why references would be handled differently than
>> pointers. Could you explain further? References and pointers can point to
>> unnamed entities (things without a variable DIE), so if that's the
>> distinction being drawn, it doesn't sound like the right one.
>>
>>
>>> > Multi-level indirection seems to have as much use as single level
>>> > indirection. (if a DWARF user may want to know what a pointer points to
>>> > even when what it points to isn't in memory, the same would hold true for
>>> > pointers to pointers, etc)
>>>
>>> For pointer to pointer, multilevel indirection is already handled, as
>>> all those cases use DW_OP_implicit_pointer.
>>> >> >> Could you show an example of the DWARF using DW_OP_implicit_pointer for >> multiple levels of indirection, and for cases where the thing being pointed >> to has no variable DIE? (eg: dynamically allocated objects ("int *i = new >> int();" but then the compiler optimizes away as in this code (compiled at >> -O3): >> void f(int); >> void f2() { >> int *i = new int(3); >> f(*i); >> delete i; >> } >> >>> >>> Regards, >>> Alok >>> >>> >>> >>> On Thu, Dec 19, 2019 at 4:54 AM David Blaikie <dblaikie at gmail.com> >>> wrote: >>> >>>> (I'm still pretty concerned that there are IR changes going in for a >>>> feature that seems incomplete and more invasive than really seems justified >>>> to me - though I admit I'm clearly not paying enough attention to this >>>> feature to have a nuanced/fully informed opinion & so maybe I just need to >>>> step back from all of this - but given the addition of new intrinsics, it >>>> seems like there should be more clear design discussion) >>>> >>>> On Tue, Dec 10, 2019 at 9:06 PM Alok Sharma <aloksharma.knit at gmail.com> >>>> wrote: >>>> >>>>> Hi David, >>>>> >>>>> This is regarding missing multilevel handling in branch for explicit >>>>> pointers. >>>>> >>>>> > * does the proposed IR format support multiple layers of dereference >>>>> (eg: int ** where we know it ultimately points to the value 3 but can't >>>>> describe either the first or second level pointers that get to that value) >>>>> - it sounds like any intrinsic that's special cased to deref (like >>>>> llvm.dbg.derefval) wouldn't be able to capture that, which seems like it's >>>>> overly narrow/special case, then? >>>>> >>>>> The PoC of DW_OP_LLVM_explicit_pointer does not have handling of >>>>> multilevel indirection. As of now it is so due to below reason. >>>>> >>>>> Explicit pointer handles cases when variable points to a temporary >>>>> which contains constant. 
Due to language standard constraints, we don't >>>>> find pointers in such cases, what we get is references. Unlike pointers, >>>>> references have single level. (reference to reference is just reference >>>>> while pointer to pointer is double pointer). >>>>> >>>> Case of reference to reference, second level can be handled using >>>>> DW_OP_LLVM_explicit_pointer itself. >>>>> Case of pointer to reference, second level can be handled using >>>>> DW_OP_implicit_pointer. >>>>> >>>>> Though it would not be complex to make explicit pointer multilevel, I >>>>> avoided so due to lack of use case. Please let me know if I am missing >>>>> something. >>>>> >>>> >>>> Sorry, I couldn't understand your language related to references and >>>> pointers - I don't understand why they would be handled differently or >>>> represent challenges/tradeoffs for features related to collapsed >>>> indirection like this. >>>> >>>> Multi-level indirection seems to have as much use as single level >>>> indirection. (if a DWARF user may want to know what a pointer points to >>>> even when what it points to isn't in memory, the same would hold true for >>>> pointers to pointers, etc) >>>> >>>> I would expect this to be handled with a general OP saying "hey, I'm >>>> skipping one level of indirection indirection in the resulting value, >>>> because that indirection is missing/not in the final program" and that this >>>> would be encoded in a llvm.dbg.value/DIExpression as usual, without the >>>> need for new IR intrinsics, though possibly with the need for an LLVM >>>> extension DWARF OP (DW_OP_LLVM_explicit_pointer?) >>>> >>>> To reconstitute that general form into the current DWARF limited >>>> "indirection needs to refer to another variable DIE" issue - as I think >>>> Paul speculated previously, we could always reconstitute a synthetic >>>> variable DIE & not try to reflect the case where the indirection lands at >>>> another named/known variable - as I expect that's the minority case. 
In >>>> most cases in C++ I expect pointers and references do not refer to named >>>> variables in the same function. They refer to return values from functions, >>>> they refer to array elements in dynamically allocated arrays, etc, etc. >>>> >>>> >>>>> >>>>> Regards, >>>>> Alok >>>>> >>>>> >>>>> On Fri, Nov 29, 2019 at 10:12 AM Alok Sharma < >>>>> aloksharma.knit at gmail.com> wrote: >>>>> >>>>>> Let me try to summarize the implementation first. >>>>>> >>>>>> At the moment, there are two branches. >>>>>> >>>>>> 1. When an existing variable is optimized out and that variable is >>>>>> used to get the de-refereced value, pointed to by another pointer/reference >>>>>> variable. >>>>>> Such cases are being addressed using Dwarf expression >>>>>> DW_OP_implicit_pointer as de-referenced value of a pointer can be seen >>>>>> implicitly (using another variable). Before Dwarf is dumped in LLVM IR, we >>>>>> represent it using dbg.derefval (which denotes derefereced value of pointer >>>>>> or reference) and DW_OP_LLVM_implicit_pointer operation. >>>>>> >>>>>> 2. When a temporary variable is optimized out and that variable is >>>>>> used to get de-referenced value of another reference variable (AFAIK it can >>>>>> not be reproduced with pointers) >>>>>> Such cases are being addressed using new Dwarf expression >>>>>> DW_OP_explicit_pointer as de-referenced value can be displayed explicitly >>>>>> (in place). In LLVM IR, we represent it using dbg.derefval and >>>>>> DW_OP_LLVM_explicit_pointer operation. >>>>>> >>>>>> Both of these two branches have some common implementation to define >>>>>> new operations (Dwarf and IR). (D70642, D70643, D69999, D69886). >>>>>> First branch has additional patches (D70260, 70384, D70385, D70419). >>>>>> Second branch has additional patch ( D70833). >>>>>> >>>>>> Let me try to comment on points raised by you. >>>>>> - Branch 2, (patch D70833) handles cases when temporaries (not >>>>>> existing variables) are optimized out. 
>>>>>> - In patch D70385, I have included test points to display that multi >>>>>> layered pointers are working >>>>>> (llvm/test/DebugInfo/dwarfdump-implicit_pointer_mem2reg.c). >>>>>> >>>>>> I feel that review of branch 1 (implicit pointer) can be resumed >>>>>> (which was halted due to current discussion), while we can continue to >>>>>> discuss branch 2 (explicit pointers D7083) if you want. David, what do you >>>>>> think? >>>>>> >>>>>> Regards, >>>>>> Alok >>>>>> >>>>>> On Fri, Nov 29, 2019 at 4:40 AM David Blaikie <dblaikie at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Sorry I haven't been more engaged with this thread, I have been >>>>>>> reading it, so hopefully my reply isn't completely out of line/irrelevant - >>>>>>> but I still feel like having a custom dwarf expression operator (& no new >>>>>>> intrinsics), like we have for one or two other DW_OP_LLVM_* (that aren't >>>>>>> actually generated into the DWARF - though this one perhaps could be in >>>>>>> some/all cases as an extension, maybe - or a synthesized variable could be >>>>>>> created for compatibility with the current DWARF standard) would make the >>>>>>> most sense. >>>>>>> >>>>>>> Some thought experiments that I think are relevant: >>>>>>> * does the proposed IR format scale to pointers that don't point to >>>>>>> existing variables (that I think has already been touched on in this thread) >>>>>>> * does the proposed IR format support multiple layers of dereference >>>>>>> (eg: int ** where we know it ultimately points to the value 3 but can't >>>>>>> describe either the first or second level pointers that get to that value) >>>>>>> - it sounds like any intrinsic that's special cased to deref (like >>>>>>> llvm.dbg.derefval) wouldn't be able to capture that, which seems like it's >>>>>>> overly narrow/special case, then? 
>>>>>>> >>>>>>> On Thu, Nov 28, 2019 at 2:29 PM Alok Sharma via llvm-dev < >>>>>>> llvm-dev at lists.llvm.org> wrote: >>>>>>> >>>>>>>> Hi folks, >>>>>>>> >>>>>>>> I am pushing a PoC patch https://reviews.llvm.org/D70833 for >>>>>>>> review which includes the case when temporary is promoted. >>>>>>>> >>>>>>>> For such cases it generates IR as >>>>>>>> >>>>>>>> call void @llvm.dbg.derefval(metadata i32 3, metadata !25, >>>>>>>> metadata !DIExpression(DW_OP_LLVM_explicit_pointer, DW_OP_LLVM_arg0)), !dbg >>>>>>>> !32 >>>>>>>> >>>>>>>> And llvm-darfdump output looks like >>>>>>>> >>>>>>>> ------------- >>>>>>>> 0x0000007b: DW_TAG_inlined_subroutine >>>>>>>> DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi") >>>>>>>> DW_AT_low_pc (0x00000000004004c6) >>>>>>>> DW_AT_high_pc (0x00000000004004d0) >>>>>>>> DW_AT_call_file >>>>>>>> ("/home/alok/openllvm/llvm-project_derefval/build.d/david.cc") >>>>>>>> DW_AT_call_line (10) >>>>>>>> DW_AT_call_column (0x03) >>>>>>>> >>>>>>>> 0x00000088: DW_TAG_formal_parameter >>>>>>>> DW_AT_location (indexed (0x0) loclist >>>>>>>> 0x00000010: >>>>>>>> [0x00000000004004c6, 0x00000000004004d4): >>>>>>>> DW_OP_explicit_pointer, DW_OP_lit3) >>>>>>>> DW_AT_abstract_origin (0x00000055 "p") >>>>>>>> ------------ >>>>>>>> >>>>>>>> Please note that DW_OP_explicit_pointer denotes that following >>>>>>>> value represents de-referenced value of optimized out pointer. With >>>>>>>> necessary changes in LLDB debugger this dwarf info can help to detect the >>>>>>>> explicit de-referenced value of 'p'. >>>>>>>> >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Should we keep on working for the above case separately and resume >>>>>>>> the review of implicit pointer independently now, which is updated with >>>>>>>> many suggestions from this discussion? 
>>>>>>>> >>>>>>>> Regards, >>>>>>>> Alok >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Nov 20, 2019 at 11:24 PM Jeremy Morse < >>>>>>>> jeremy.morse.llvm at gmail.com> wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> For a new way of representing things, >>>>>>>>> >>>>>>>>> Adrian wrote: >>>>>>>>> > llvm.dbg.value_new(DILocalVariable("y"), >>>>>>>>> DIExpression(DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus), >>>>>>>>> > %ptr, %ofs) >>>>>>>>> >>>>>>>>> I think this would be great -- there're definitely some constructs >>>>>>>>> created by the induction-variables pass and similar where one could >>>>>>>>> recover an implicit variable value, if you could for example >>>>>>>>> subtract >>>>>>>>> one pointer from another. >>>>>>>>> >>>>>>>>> With the current model of storing DIExpressions as a vector of >>>>>>>>> opcodes, it might become a pain to salvage a Value that gets >>>>>>>>> optimised >>>>>>>>> out --in the example, if %ofs were salvaged, presumably >>>>>>>>> DW_OP_LLVM_arg1 could have to be replaced with several extra >>>>>>>>> operations. This isn't insurmountable, but I've repeatedly shied >>>>>>>>> away >>>>>>>>> from scanning through DIExpressions to patch them up. A vector of >>>>>>>>> opcodes is the final output of the compiler, IMHO richer metadata >>>>>>>>> should be used in the meantime. >>>>>>>>> >>>>>>>>> IMHO the implicit pointer work doesn't need to block on this. As >>>>>>>>> said >>>>>>>>> my mild preference would be for a new intrinsic for this form of >>>>>>>>> variable location. >>>>>>>>> >>>>>>>>> ~ >>>>>>>>> >>>>>>>>> Inre PR37682, >>>>>>>>> >>>>>>>>> > I’ve been reminded of PR37682, where a function with a reference >>>>>>>>> parameter might spend all its time computing the “referenced” value in a >>>>>>>>> temp, and only move the final value back to the referenced object at the >>>>>>>>> end. This is clearly a situation that could benefit from >>>>>>>>> DW_OP_implicit_pointer, and there is really no other-object DIE for it to >>>>>>>>> refer to. 
>>>>>>>>> Given the current spec, the compiler would need to produce a
>>>>>>>>> DW_TAG_dwarf_procedure for the parameter DIE to refer to. Appendix D
>>>>>>>>> (Figure D.61) has an example of this construction, although it’s a more
>>>>>>>>> contrived source example.
>>>>>>>>>
>>>>>>>>> This has been working through my mind too, and I think it's slightly
>>>>>>>>> different to what implicit_pointer is trying to achieve. In the case
>>>>>>>>> implicit_pointer is designed for, it's a strict improvement in debug
>>>>>>>>> experience because you're recovering information that couldn't be
>>>>>>>>> expressed. However for PR37682 it's a trade-off between whether the
>>>>>>>>> user might want to examine the pointer, or the pointed-at integer:
>>>>>>>>> AFAIUI, we can only express one of the two, not both. Whereas for
>>>>>>>>> mem2reg'd variables referred to by DIE, there is never a pointer to be
>>>>>>>>> lost.
>>>>>>>>>
>>>>>>>>> I think my preference would always be to see temporarily-promoted
>>>>>>>>> values as there's no other way of observing them, but others might
>>>>>>>>> disagree.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks,
>>>>>>>>> Jeremy
Alok Sharma via llvm-dev
2020-Jan-01 19:04 UTC
[llvm-dev] DW_OP_implicit_pointer design/implementation in general
Hi David,

Happy new year!

I just uploaded a PoC patch that covers the cases where a pointer points to unnamed variables, using DW_OP_implicit_pointer (references and dynamic allocation). It uses an artificial variable, as suggested by Paul.

https://reviews.llvm.org/D72055

I hope this addresses your concerns.

Scope of DW_OP_implicit_pointer: as we initially decided, the original patch was to be split into back-end changes plus separate front-end changes, each addressing a different scope. The current patch fits that decision. We now cover most of the cases, if not all, and more scope can be added whenever needed, so we don't need to stall the current set of patches.

Regarding the addition of a new intrinsic: it came out of the current discussion, and it helps us distinguish whether the value of a variable is denoted or the dereferenced value of a (pointer/reference) variable is denoted. It has gone through a first round of review.

Please let me know your thoughts.

Regards,
Alok

On Wed, Dec 25, 2019 at 2:30 AM David Blaikie <dblaikie at gmail.com> wrote:
> (sorry, I accidentally dropped everyone/and the list from the thread,
> adding them back in here)
>
> OK, so we're on the same page, none of this has anything to do with
> references specifically, but about whether the pointer or reference
> points/refers to a named variable or not.
>
> And this was the point I was trying to make (perhaps poorly) at the start:
> The DWARF feature (OP_implicit_pointer) is both awkward to implement (seen
> by the fact that there's discussion to add new intrinsics to support it)
> and only supports a small subset of cases.
>
> My argument is that we should not implement this DWARF feature - or, at
> least, not the way we're doing it.
>
> We should implement a more general feature (currently titled
> OP_LLVM_explicit_pointer - though I'm not sure "explicit" v "implicit" is
> expressing, at least to me, the distinction between these two things) &
> only that, at least only that at the LLVM IR level.
>
> Possibly we can implement DWARFv5 standard support (perhaps this is the
> only DWARF emission mode we support) where the OP_LLVM_explicit_pointer is
> lowered to an artificial variable (doesn't need a name, file/line number,
> etc - it's just a big indirection due to the way DWARFv5 implicit_pointer
> is specified) with the location attached to it.
>
> We lose some functionality by doing this, for sure - the consumer won't
> know that the pointer aliases a variable, though GDB at least doesn't
> visualize the name of the variable a pointer points to so far as I can
> tell, for example (Sony/Apple - do you have debuggers that would have
> particularly improved UI were you to know that a pointer points to a
> particular variable, rather than to the pieces of the variable's value
> (assuming the variable isn't completely memory at all - so it's up in
> registers, incomplete/fragmented hunks of memory, etc))
>
> I think implementing /only/ this more general solution is tidier (doesn't
> introduce new LLVM IR intrinsics & the complexities of tracking which
> variable another variable points to) and covers more cases. If at some
> point in the future someone finds that the value on top of this, of knowing
> the name of the variable being pointed to, is worth the added complexity -
> then we could do that. But my gut feel is that it is not worth the added
> complexity.
>
>
>
> On Mon, Dec 23, 2019 at 11:05 PM Alok Sharma <aloksharma.knit at gmail.com>
> wrote:
>
>> > Hmm. I don't understand why references would be handled differently
>> > than pointers. Could you explain further?
References and pointers can point >> to unnamed entities (things without a variable DIE), so if that's the >> distinction being drawn, it doesn't sound like the right one. >> >> As in example above pointers can be of multi-level (int *, int **, int >> ***) but references remain single level (int &). >> The difference of handling both "DW_OP_implicit_pointer" and >> "DW_OP_LLVM_explicit_pointer" is not based on pointer or reference. It is >> based on how their values can be represented. >> >> A)- Use implicit_pointer, when value can be found in another DIE (it can >> be case of pointer or reference and there can be multilevel of indirection) >> B)- Use explicit_pointer, when value can be found in-place without >> redirecting it to another DIE (the case of pointee is temporary, I could >> not find example of multi-level of redirection with all intermediate levels >> being temp) >> >> For example, below test case for reference uses implicit_pointer >> >> ----------------------------------------------- >> volatile int gvar = 7; >> >> int func(int &ref) { >> gvar = ref; >> return ref + 5; >> } >> >> int main() { >> int var = 4; >> int &refVar = var; >> >> int res = func(refVar); >> >> return res; >> } >> ----------------------------------------------- >> >> This is the case of reference using DW_OP_implicit_pointer >> ----------------------------------------------- >> 0x00000076: DW_TAG_variable >> DW_AT_const_value (4) >> DW_AT_name ("var") >> DW_AT_decl_file ("ref.cc") >> DW_AT_decl_line (9) >> DW_AT_type (0x00000037 "int") >> >> 0x0000007f: DW_TAG_variable >> DW_AT_location (indexed (0x1) loclist >> 0x00000023: >> [0x0000000000400490, 0x00000000004004a0): >> DW_OP_implicit_pointer 0x76 +0) <----- refVar points to var >> DW_AT_name ("refVar") >> DW_AT_decl_file ("ref.cc") >> DW_AT_decl_line (10) >> DW_AT_type (0x00000062 "int&") >> >> 0x00000088: DW_TAG_variable >> DW_AT_location (indexed (0x2) loclist >> 0x0000002e: >> [0x000000000040049a, 0x00000000004004a0): >> 
DW_OP_consts +9, DW_OP_stack_value) >> DW_AT_name ("res") >> DW_AT_decl_file ("ref.cc") >> DW_AT_decl_line (12) >> DW_AT_type (0x00000037 "int") >> >> 0x00000091: DW_TAG_inlined_subroutine >> DW_AT_abstract_origin (0x0000004f "_Z4funcRi") >> DW_AT_low_pc (0x0000000000400490) >> DW_AT_high_pc (0x000000000040049a) >> DW_AT_call_file ("ref.cc") >> DW_AT_call_line (12) >> DW_AT_call_column (0x0d) >> >> 0x0000009e: DW_TAG_formal_parameter >> DW_AT_location (indexed (0x0) loclist >> 0x00000018: >> [0x0000000000400490, 0x00000000004004a0): >> DW_OP_implicit_pointer 0x76 +0) <---------- ref points to var >> DW_AT_abstract_origin (0x00000059 "ref") >> ----------------------------------------------- >> >> >> >> >> > Could you show an example of the DWARF using DW_OP_implicit_pointer for >> multiple levels of indirection, >> >> Let me present a part from the test case >> test/DebugInfo/implicit_pointer_mem2reg.c >> ------------------------------------------------ >> # cat multilevel_pointer.c >> static const char *b = "opq"; >> volatile int v; >> int main() { >> int var1 = 4; >> int *ptr1; >> int **ptrptr1; >> >> v++; >> ptr1 = &var1; >> ptrptr1 = &ptr1; >> v++; >> >> return *ptr1 + **ptrptr1 - 5; >> } >> ------------------------------------------------ >> With the the current set of patches it produces DWARF as below >> >> ----------------------------------------------- >> # llvm-dwarfdump multilevel_pointer >> 0x0000004a: DW_TAG_variable >> DW_AT_const_value (4) >> DW_AT_name ("var1") >> DW_AT_decl_file >> ("/home/alok/openllvm/llvm-project_derefval/build.d/multilevel_pointer.c") >> DW_AT_decl_line (14) >> DW_AT_type (0x00000037 "int") >> >> 0x00000053: DW_TAG_variable >> DW_AT_location (indexed (0x0) loclist >> 0x00000014: >> [0x0000000000400487, 0x0000000000400494): >> DW_OP_implicit_pointer 0x5c +0) >> <----------------------- ptrptr1 points to ptr1 >> DW_AT_name ("ptrptr1") >> DW_AT_decl_file >> ("/home/alok/openllvm/llvm-project_derefval/build.d/multilevel_pointer.c") 
>> DW_AT_decl_line (16) >> DW_AT_type (0x00000066 "int**") >> >> 0x0000005c: DW_TAG_variable >> DW_AT_location (indexed (0x1) loclist >> 0x0000001f: >> [0x0000000000400487, 0x0000000000400494): >> DW_OP_implicit_pointer 0x4a +0) >> <---------------------- ptr1 points to var1 >> DW_AT_name ("ptr1") >> DW_AT_decl_file >> ("/home/alok/openllvm/llvm-project_derefval/build.d/multilevel_pointer.c") >> DW_AT_decl_line (15) >> DW_AT_type (0x0000006b "int*") >> ------------------------------------------- >> >> > , and for cases where the thing being pointed to has no variable DIE? >> (eg: dynamically allocated objects ("int *i = new int();" but then the >> compiler optimizes away as in this code (compiled at -O3): >> >> >> For test case provided by you earlier >> -------------------------------- >> __attribute__((optnone)) int source() { >> return 3; >> } >> __attribute__((optnone)) void f(int i) { >> } >> inline void sink(const int& p) { >> f(p); >> } >> int main() { >> sink(source()); >> } >> -------------------------------- >> >> Since the pointee is temporary, the dereference value can be displayed >> in-place using DW_OP_LLVM_explicit_pointer. 
>> >> -------------------------------- >> 0x0000006c: DW_TAG_subprogram >> DW_AT_low_pc (0x00000000004004c0) >> DW_AT_high_pc (0x00000000004004d4) >> DW_AT_frame_base (DW_OP_reg7 RSP) >> DW_AT_call_all_calls (true) >> DW_AT_name ("main") >> DW_AT_decl_file ("explicit_pointer.cc") >> DW_AT_decl_line (14) >> DW_AT_type (0x00000068 "int") >> DW_AT_external (true) >> >> 0x0000007b: DW_TAG_inlined_subroutine >> DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi") >> DW_AT_low_pc (0x00000000004004c6) >> DW_AT_high_pc (0x00000000004004d0) >> DW_AT_call_file ("explicit_pointer.cc") >> DW_AT_call_line (15) >> DW_AT_call_column (0x03) >> >> 0x00000088: DW_TAG_formal_parameter >> DW_AT_location (indexed (0x0) loclist >> 0x00000010: >> [0x00000000004004c6, 0x00000000004004d4): >> DW_OP_LLVM_explicit_pointer, DW_OP_lit3) <---- in-place >> representation of value *p=3 >> DW_AT_abstract_origin (0x00000055 "p") >> >> ------------------------------------------------- >> Another case you just shared can be represented using " >> DW_OP_LLVM_explicit_pointer", it can be brought under scope. >> >> Since these cases are *aspire* (gcc also doesnt have these cases >> in-scope) cases and can be solved separately. >> >> Regards, >> Alok >> >> On Tue, Dec 24, 2019 at 12:52 AM David Blaikie <dblaikie at gmail.com> >> wrote: >> >>> >>> >>> On Mon, Dec 23, 2019 at 10:23 AM Alok Sharma <aloksharma.knit at gmail.com> >>> wrote: >>> >>>> Hi David, >>>> >>>> > Sorry, I couldn't understand your language related to references and >>>> pointers - I don't understand why they would be handled differently or >>>> represent challenges/tradeoffs for features related to collapsed >>>> indirection like this. >>>> >>>> Let me try to explain what I wanted to convey with an example. 
>>>>
>>>> Example of multilevel pointers:
>>>>
>>>> int var;
>>>> int *ptr = &var;      // first level of indirection
>>>> int **ptrptr = &ptr;  // second level of indirection
>>>>
>>>> Example of multilevel references:
>>>>
>>>> int var;
>>>> int &ref = var;     // first level of reference
>>>> int &refref = ref;  // second level of reference
>>>>
>>>> Though the variable refref is a reference to another reference, it is
>>>> still of type reference.
>>>>
>>>> As I said earlier, I am struggling to find a case where multiple levels of
>>>> indirection are needed with DW_OP_LLVM_explicit_pointer in the case of
>>>> *references*; please let me know if you have any example in mind. I shall
>>>> modify the patch for multiple levels of indirection.
>>>> (DW_OP_LLVM_explicit_pointer is used only in the case of references.)
>>>>
>>>
>>> Hmm. I don't understand why references would be handled differently than
>>> pointers. Could you explain further? References and pointers can point to
>>> unnamed entities (things without a variable DIE), so if that's the
>>> distinction being drawn, it doesn't sound like the right one.
>>>
>>>
>>>> > Multi-level indirection seems to have as much use as single level
>>>> > indirection. (if a DWARF user may want to know what a pointer points to
>>>> > even when what it points to isn't in memory, the same would hold true for
>>>> > pointers to pointers, etc)
>>>>
>>>> For pointer to pointer, multilevel indirection is already handled, as
>>>> all those cases use DW_OP_implicit_pointer.
>>>>
>>>
>>> Could you show an example of the DWARF using DW_OP_implicit_pointer for
>>> multiple levels of indirection, and for cases where the thing being pointed
>>> to has no variable DIE?
(eg: dynamically allocated objects ("int *i = new >>> int();" but then the compiler optimizes away as in this code (compiled at >>> -O3): >>> void f(int); >>> void f2() { >>> int *i = new int(3); >>> f(*i); >>> delete i; >>> } >>> >>>> >>>> Regards, >>>> Alok >>>> >>>> >>>> >>>> On Thu, Dec 19, 2019 at 4:54 AM David Blaikie <dblaikie at gmail.com> >>>> wrote: >>>> >>>>> (I'm still pretty concerned that there are IR changes going in for a >>>>> feature that seems incomplete and more invasive than really seems justified >>>>> to me - though I admit I'm clearly not paying enough attention to this >>>>> feature to have a nuanced/fully informed opinion & so maybe I just need to >>>>> step back from all of this - but given the addition of new intrinsics, it >>>>> seems like there should be more clear design discussion) >>>>> >>>>> On Tue, Dec 10, 2019 at 9:06 PM Alok Sharma <aloksharma.knit at gmail.com> >>>>> wrote: >>>>> >>>>>> Hi David, >>>>>> >>>>>> This is regarding missing multilevel handling in branch for explicit >>>>>> pointers. >>>>>> >>>>>> > * does the proposed IR format support multiple layers of >>>>>> dereference (eg: int ** where we know it ultimately points to the value 3 >>>>>> but can't describe either the first or second level pointers that get to >>>>>> that value) - it sounds like any intrinsic that's special cased to deref >>>>>> (like llvm.dbg.derefval) wouldn't be able to capture that, which seems like >>>>>> it's overly narrow/special case, then? >>>>>> >>>>>> The PoC of DW_OP_LLVM_explicit_pointer does not have handling of >>>>>> multilevel indirection. As of now it is so due to below reason. >>>>>> >>>>>> Explicit pointer handles cases when variable points to a temporary >>>>>> which contains constant. Due to language standard constraints, we don't >>>>>> find pointers in such cases, what we get is references. Unlike pointers, >>>>>> references have single level. 
>>>>>> (reference to reference is just reference
>>>>>> while pointer to pointer is double pointer).
>>>>>>
>>>>>> Case of reference to reference, second level can be handled using
>>>>>> DW_OP_LLVM_explicit_pointer itself.
>>>>>> Case of pointer to reference, second level can be handled using
>>>>>> DW_OP_implicit_pointer.
>>>>>>
>>>>>> Though it would not be complex to make explicit pointer multilevel, I
>>>>>> avoided doing so due to lack of a use case. Please let me know if I am
>>>>>> missing something.
>>>>>>
>>>>>
>>>>> Sorry, I couldn't understand your language related to references and
>>>>> pointers - I don't understand why they would be handled differently or
>>>>> represent challenges/tradeoffs for features related to collapsed
>>>>> indirection like this.
>>>>>
>>>>> Multi-level indirection seems to have as much use as single level
>>>>> indirection. (if a DWARF user may want to know what a pointer points to
>>>>> even when what it points to isn't in memory, the same would hold true for
>>>>> pointers to pointers, etc)
>>>>>
>>>>> I would expect this to be handled with a general OP saying "hey, I'm
>>>>> skipping one level of indirection in the resulting value,
>>>>> because that indirection is missing/not in the final program" and that this
>>>>> would be encoded in a llvm.dbg.value/DIExpression as usual, without the
>>>>> need for new IR intrinsics, though possibly with the need for an LLVM
>>>>> extension DWARF OP (DW_OP_LLVM_explicit_pointer?)
>>>>>
>>>>> To reconstitute that general form into the current DWARF limited
>>>>> "indirection needs to refer to another variable DIE" issue - as I think
>>>>> Paul speculated previously, we could always reconstitute a synthetic
>>>>> variable DIE & not try to reflect the case where the indirection lands at
>>>>> another named/known variable - as I expect that's the minority case. In
>>>>> most cases in C++ I expect pointers and references do not refer to named
>>>>> variables in the same function.
>>>>> They refer to return values from functions,
>>>>> they refer to array elements in dynamically allocated arrays, etc, etc.
>>>>>
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Alok
>>>>>>
>>>>>>
>>>>>> On Fri, Nov 29, 2019 at 10:12 AM Alok Sharma <aloksharma.knit at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Let me try to summarize the implementation first.
>>>>>>>
>>>>>>> At the moment, there are two branches.
>>>>>>>
>>>>>>> 1. When an existing variable is optimized out and that variable is
>>>>>>> used to get the de-referenced value, pointed to by another pointer/reference
>>>>>>> variable.
>>>>>>> Such cases are being addressed using Dwarf expression
>>>>>>> DW_OP_implicit_pointer as de-referenced value of a pointer can be seen
>>>>>>> implicitly (using another variable). Before Dwarf is dumped, in LLVM IR we
>>>>>>> represent it using dbg.derefval (which denotes dereferenced value of pointer
>>>>>>> or reference) and DW_OP_LLVM_implicit_pointer operation.
>>>>>>>
>>>>>>> 2. When a temporary variable is optimized out and that variable is
>>>>>>> used to get de-referenced value of another reference variable (AFAIK it
>>>>>>> cannot be reproduced with pointers).
>>>>>>> Such cases are being addressed using new Dwarf expression
>>>>>>> DW_OP_explicit_pointer as de-referenced value can be displayed explicitly
>>>>>>> (in place). In LLVM IR, we represent it using dbg.derefval and
>>>>>>> DW_OP_LLVM_explicit_pointer operation.
>>>>>>>
>>>>>>> Both of these two branches have some common implementation to define
>>>>>>> new operations (Dwarf and IR). (D70642, D70643, D69999, D69886).
>>>>>>> First branch has additional patches (D70260, D70384, D70385, D70419).
>>>>>>> Second branch has additional patch (D70833).
>>>>>>>
>>>>>>> Let me try to comment on points raised by you.
>>>>>>> - Branch 2, (patch D70833) handles cases when temporaries (not
>>>>>>> existing variables) are optimized out.
>>>>>>> - In patch D70385, I have included test points to display that multi
>>>>>>> layered pointers are working
>>>>>>> (llvm/test/DebugInfo/dwarfdump-implicit_pointer_mem2reg.c).
>>>>>>>
>>>>>>> I feel that review of branch 1 (implicit pointer) can be resumed
>>>>>>> (which was halted due to current discussion), while we can continue to
>>>>>>> discuss branch 2 (explicit pointers D70833) if you want. David, what do you
>>>>>>> think?
>>>>>>>
>>>>>>> Regards,
>>>>>>> Alok
>>>>>>>
>>>>>>> On Fri, Nov 29, 2019 at 4:40 AM David Blaikie <dblaikie at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Sorry I haven't been more engaged with this thread, I have been
>>>>>>>> reading it, so hopefully my reply isn't completely out of line/irrelevant -
>>>>>>>> but I still feel like having a custom dwarf expression operator (& no new
>>>>>>>> intrinsics), like we have for one or two other DW_OP_LLVM_* (that aren't
>>>>>>>> actually generated into the DWARF - though this one perhaps could be in
>>>>>>>> some/all cases as an extension, maybe - or a synthesized variable could be
>>>>>>>> created for compatibility with the current DWARF standard) would make the
>>>>>>>> most sense.
>>>>>>>>
>>>>>>>> Some thought experiments that I think are relevant:
>>>>>>>> * does the proposed IR format scale to pointers that don't point to
>>>>>>>> existing variables (that I think has already been touched on in this thread)
>>>>>>>> * does the proposed IR format support multiple layers of
>>>>>>>> dereference (eg: int ** where we know it ultimately points to the value 3
>>>>>>>> but can't describe either the first or second level pointers that get to
>>>>>>>> that value) - it sounds like any intrinsic that's special cased to deref
>>>>>>>> (like llvm.dbg.derefval) wouldn't be able to capture that, which seems like
>>>>>>>> it's overly narrow/special case, then?
>>>>>>>>
>>>>>>>> On Thu, Nov 28, 2019 at 2:29 PM Alok Sharma via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> Hi folks,
>>>>>>>>>
>>>>>>>>> I am pushing a PoC patch https://reviews.llvm.org/D70833 for
>>>>>>>>> review which includes the case when a temporary is promoted.
>>>>>>>>>
>>>>>>>>> For such cases it generates IR as
>>>>>>>>>
>>>>>>>>> call void @llvm.dbg.derefval(metadata i32 3, metadata !25, metadata !DIExpression(DW_OP_LLVM_explicit_pointer, DW_OP_LLVM_arg0)), !dbg !32
>>>>>>>>>
>>>>>>>>> And llvm-dwarfdump output looks like
>>>>>>>>>
>>>>>>>>> -------------
>>>>>>>>> 0x0000007b: DW_TAG_inlined_subroutine
>>>>>>>>>               DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi")
>>>>>>>>>               DW_AT_low_pc (0x00000000004004c6)
>>>>>>>>>               DW_AT_high_pc (0x00000000004004d0)
>>>>>>>>>               DW_AT_call_file ("/home/alok/openllvm/llvm-project_derefval/build.d/david.cc")
>>>>>>>>>               DW_AT_call_line (10)
>>>>>>>>>               DW_AT_call_column (0x03)
>>>>>>>>>
>>>>>>>>> 0x00000088: DW_TAG_formal_parameter
>>>>>>>>>               DW_AT_location (indexed (0x0) loclist 0x00000010:
>>>>>>>>>                  [0x00000000004004c6, 0x00000000004004d4): DW_OP_explicit_pointer, DW_OP_lit3)
>>>>>>>>>               DW_AT_abstract_origin (0x00000055 "p")
>>>>>>>>> ------------
>>>>>>>>>
>>>>>>>>> Please note that DW_OP_explicit_pointer denotes that the following
>>>>>>>>> value represents the de-referenced value of an optimized out pointer. With
>>>>>>>>> necessary changes in the LLDB debugger, this dwarf info can help to detect
>>>>>>>>> the explicit de-referenced value of 'p'.
>>>>>>>>>
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> Should we keep on working on the above case separately and resume
>>>>>>>>> the review of implicit pointer independently now, which is updated with
>>>>>>>>> many suggestions from this discussion?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Alok
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Nov 20, 2019 at 11:24 PM Jeremy Morse <
>>>>>>>>> jeremy.morse.llvm at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> For a new way of representing things,
>>>>>>>>>>
>>>>>>>>>> Adrian wrote:
>>>>>>>>>> > llvm.dbg.value_new(DILocalVariable("y"),
>>>>>>>>>> >     DIExpression(DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus),
>>>>>>>>>> >     %ptr, %ofs)
>>>>>>>>>>
>>>>>>>>>> I think this would be great -- there're definitely some constructs
>>>>>>>>>> created by the induction-variables pass and similar where one could
>>>>>>>>>> recover an implicit variable value, if you could for example subtract
>>>>>>>>>> one pointer from another.
>>>>>>>>>>
>>>>>>>>>> With the current model of storing DIExpressions as a vector of
>>>>>>>>>> opcodes, it might become a pain to salvage a Value that gets optimised
>>>>>>>>>> out -- in the example, if %ofs were salvaged, presumably
>>>>>>>>>> DW_OP_LLVM_arg1 could have to be replaced with several extra
>>>>>>>>>> operations. This isn't insurmountable, but I've repeatedly shied away
>>>>>>>>>> from scanning through DIExpressions to patch them up. A vector of
>>>>>>>>>> opcodes is the final output of the compiler; IMHO richer metadata
>>>>>>>>>> should be used in the meantime.
>>>>>>>>>>
>>>>>>>>>> IMHO the implicit pointer work doesn't need to block on this. As said,
>>>>>>>>>> my mild preference would be for a new intrinsic for this form of
>>>>>>>>>> variable location.
>>>>>>>>>>
>>>>>>>>>> ~
>>>>>>>>>>
>>>>>>>>>> In re PR37682,
>>>>>>>>>>
>>>>>>>>>> > I’ve been reminded of PR37682, where a function with a
>>>>>>>>>> > reference parameter might spend all its time computing the “referenced”
>>>>>>>>>> > value in a temp, and only move the final value back to the referenced
>>>>>>>>>> > object at the end.
>>>>>>>>>> > This is clearly a situation that could benefit from
>>>>>>>>>> > DW_OP_implicit_pointer, and there is really no other-object DIE for it to
>>>>>>>>>> > refer to. Given the current spec, the compiler would need to produce a
>>>>>>>>>> > DW_TAG_dwarf_procedure for the parameter DIE to refer to. Appendix D
>>>>>>>>>> > (Figure D.61) has an example of this construction, although it’s a more
>>>>>>>>>> > contrived source example.
>>>>>>>>>>
>>>>>>>>>> This has been working through my mind too, and I think it's slightly
>>>>>>>>>> different to what implicit_pointer is trying to achieve. In the case
>>>>>>>>>> implicit_pointer is designed for, it's a strict improvement in debug
>>>>>>>>>> experience because you're recovering information that couldn't be
>>>>>>>>>> expressed. However for PR37682 it's a trade-off between whether the
>>>>>>>>>> user might want to examine the pointer, or the pointed-at integer:
>>>>>>>>>> AFAIUI, we can only express one of the two, not both. Whereas for
>>>>>>>>>> mem2reg'd variables referred to by DIE, there is never a pointer to be
>>>>>>>>>> lost.
>>>>>>>>>>
>>>>>>>>>> I think my preference would always be to see temporarily-promoted
>>>>>>>>>> values as there's no other way of observing them, but others might
>>>>>>>>>> disagree.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks,
>>>>>>>>>> Jeremy
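
For readers following the pointer-versus-reference part of the thread, here is a minimal C++ sketch of the source-level cases being discussed: two levels of pointer indirection, a reference bound through another reference (which remains a plain int&), and a pointer to an unnamed heap object as in the f2() example. The function names are hypothetical and chosen only for illustration; nothing here comes from the patches under review, and it says nothing about what DWARF a compiler would actually emit for these cases.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper names, for illustration only.

// Pointer case: each level of indirection is a distinct type (int*, int**).
inline int read_through_pointers() {
    int var = 3;
    int *ptr = &var;      // first level of indirection
    int **ptrptr = &ptr;  // second level of indirection
    static_assert(std::is_same<decltype(ptrptr), int **>::value,
                  "two pointer levels are visible in the type");
    return **ptrptr;
}

// Reference case: binding a reference through another reference does not
// add a level; refref is still int& and refers directly to var.
inline int read_through_references() {
    int var = 3;
    int &ref = var;
    int &refref = ref;
    static_assert(std::is_same<decltype(refref), int &>::value,
                  "a reference bound via a reference is still int&");
    assert(&refref == &var);  // same object, no intermediate level
    return refref;
}

// Unnamed pointee: the pointed-to int has no variable of its own, which is
// the situation raised with "int *i = new int(3);" earlier in the thread.
inline int read_unnamed_pointee() {
    int *i = new int(3);
    int value = *i;
    delete i;
    return value;
}
```

All three functions return 3. The sketch only illustrates why the type system gives references a single level while pointers can stack; whether debug info should treat the two differently is the open question in the thread.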