thr3ads.net - llvm dev - [llvm-dev] DW_OP_implicit_pointer design/implementation in general [Dec 2019]

David Blaikie via llvm-dev

2019-Dec-18 23:24 UTC

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

(I'm still pretty concerned that there are IR changes going in for a
feature that seems incomplete and more invasive than really seems justified
to me - though I admit I'm clearly not paying enough attention to this
feature to have a nuanced/fully informed opinion & so maybe I just need to
step back from all of this - but given the addition of new intrinsics, it
seems like there should be more clear design discussion)

On Tue, Dec 10, 2019 at 9:06 PM Alok Sharma <aloksharma.knit at gmail.com>
wrote:
> Hi David,
>
> This is regarding missing multilevel handling in branch for explicit
> pointers.
>
> > * does the proposed IR format support multiple layers of dereference
> > (eg: int ** where we know it ultimately points to the value 3 but can't
> > describe either the first or second level pointers that get to that value)
> > - it sounds like any intrinsic that's special cased to deref (like
> > llvm.dbg.derefval) wouldn't be able to capture that, which seems like
> > it's overly narrow/special case, then?
>
> The PoC of DW_OP_LLVM_explicit_pointer does not have handling of
> multilevel indirection. As of now it is so due to below reason.
>
>  Explicit pointer handles cases when variable points to a temporary which
> contains constant. Due to language standard constraints, we don't find
> pointers in such cases, what we get is references. Unlike pointers,
> references have single level. (reference to reference is just reference
> while pointer to pointer is double pointer).
>  Case of reference to reference, second level can be handled using
> DW_OP_LLVM_explicit_pointer itself.
>  Case of pointer to reference, second level can be handled using
> DW_OP_implicit_pointer.
>
> Though it would not be complex to make explicit pointer multilevel, I
> avoided so due to lack of use case. Please let me know if I am missing
> something.
>
Sorry, I couldn't understand your language related to references and
pointers - I don't understand why they would be handled differently or
represent challenges/tradeoffs for features related to collapsed
indirection like this.

Multi-level indirection seems to have as much use as single level
indirection. (if a DWARF user may want to know what a pointer points to
even when what it points to isn't in memory, the same would hold true for
pointers to pointers, etc)

I would expect this to be handled with a general OP saying "hey, I'm
skipping one level of indirection in the resulting value,
because that indirection is missing/not in the final program" and that this
would be encoded in a llvm.dbg.value/DIExpression as usual, without the
need for new IR intrinsics, though possibly with the need for an LLVM
extension DWARF OP (DW_OP_LLVM_explicit_pointer?)
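
A minimal sketch of the stackable-operator idea above, in Python purely for illustration. The opcode string reuses the name floated in this thread, but the encoding, the `Described` record, and the `evaluate`/`show` helpers are assumptions made for this example, not LLVM or DWARF APIs.

```python
from dataclasses import dataclass
from typing import List, Union

# Hypothetical opcode; the name comes from this thread, the encoding does not.
SKIP_INDIRECTION = "DW_OP_LLVM_explicit_pointer"


@dataclass(frozen=True)
class Described:
    """What a debugger could show for a variable whose pointers are gone."""
    missing_levels: int  # pointer/reference layers that were optimized away
    pointee: int         # value reached after following all of those layers


def evaluate(expr: List[Union[str, int]]) -> Described:
    """Evaluate a toy expression: zero or more skip ops, then a constant."""
    levels = 0
    for op in expr:
        if op == SKIP_INDIRECTION:
            levels += 1  # each op records one skipped level of indirection
        elif isinstance(op, int):
            return Described(levels, op)
        else:
            raise ValueError(f"unknown op: {op!r}")
    raise ValueError("expression has no value operand")


def show(name: str, d: Described) -> str:
    """Render the dereference a user would type, e.g. '**pp = 3'."""
    return f"{'*' * d.missing_levels}{name} = {d.pointee}"
```

Because the operator simply repeats, `int **` needs no special casing: `evaluate([SKIP_INDIRECTION, SKIP_INDIRECTION, 3])` describes two missing levels ending at the value 3.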

To reconstitute that general form into the current DWARF limited
"indirection needs to refer to another variable DIE" issue - as I think
Paul speculated previously, we could always reconstitute a synthetic
variable DIE & not try to reflect the case where the indirection lands at
another named/known variable - as I expect that's the minority case. In
most cases in C++ I expect pointers and references do not refer to named
variables in the same function. They refer to return values from functions,
they refer to array elements in dynamically allocated arrays, etc, etc.
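
The synthetic-DIE lowering described above could look roughly like the following Python sketch. It assumes the standard DWARF 5 operand shape for DW_OP_implicit_pointer (a DIE reference plus a byte offset). The `__synthetic_*` naming scheme and the tuple representation are hypothetical and not anything LLVM emits.

```python
from typing import Dict, List, Tuple


def lower_to_synthetic_dies(var: str, levels: int, value: int) -> Dict[str, List[Tuple]]:
    """Map each DIE name to a toy location expression.

    `levels` layers of missing indirection become a chain of artificial
    variables, each reached from the previous one via DW_OP_implicit_pointer.
    """
    if levels < 0:
        raise ValueError("levels must be non-negative")
    dies: Dict[str, List[Tuple]] = {}
    current = var
    for depth in range(levels):
        target = f"__synthetic_{var}_{depth}"  # hypothetical naming scheme
        # Operands: referenced DIE, byte offset into the object it describes.
        dies[current] = [("DW_OP_implicit_pointer", target, 0)]
        current = target
    # The innermost DIE carries the value itself (unsigned constants only here).
    dies[current] = [("DW_OP_constu", value), ("DW_OP_stack_value",)]
    return dies
```

For `int **pp` that ultimately reaches 3, `lower_to_synthetic_dies("pp", 2, 3)` yields three entries: `pp` and one synthetic DIE each hold an implicit pointer, and the last synthetic DIE holds the constant.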

>
> Regards,
> Alok
>
>
> On Fri, Nov 29, 2019 at 10:12 AM Alok Sharma <aloksharma.knit at gmail.com>
> wrote:
>
>> Let me try to summarize the implementation first.
>>
>> At the moment, there are two branches.
>>
>> 1. When an existing variable is optimized out and that variable is used
>> to get the de-referenced value, pointed to by another pointer/reference
>> variable.
>>   Such cases are being addressed using Dwarf expression
>> DW_OP_implicit_pointer as de-referenced value of a pointer can be seen
>> implicitly (using another variable). Before Dwarf is emitted, we
>> represent it in LLVM IR using dbg.derefval (which denotes de-referenced
>> value of pointer or reference) and DW_OP_LLVM_implicit_pointer operation.
>>
>> 2. When a temporary variable is optimized out and that variable is used
>> to get de-referenced value of another reference variable (AFAIK it cannot
>> be reproduced with pointers)
>>   Such cases are being addressed using new Dwarf expression
>> DW_OP_explicit_pointer as de-referenced value can be displayed explicitly
>> (in place). In LLVM IR, we represent it using dbg.derefval and
>> DW_OP_LLVM_explicit_pointer operation.
>>
>> Both of these two branches have some common implementation to define new
>> operations (Dwarf and IR). (D70642, D70643, D69999, D69886).
>> First branch has additional patches (D70260, D70384, D70385, D70419).
>> Second branch has additional patch (D70833).
>>
>> Let me try to comment on points raised by you.
>> - Branch 2, (patch D70833) handles cases when temporaries (not existing
>> variables) are optimized out.
>> - In patch D70385, I have included test points to display that multi
>> layered pointers are working
>> (llvm/test/DebugInfo/dwarfdump-implicit_pointer_mem2reg.c).
>>
>> I feel that review of branch 1 (implicit pointer) can be resumed (which
>> was halted due to current discussion), while we can continue to discuss
>> branch 2 (explicit pointers, D70833) if you want. David, what do you think?
>>
>> Regards,
>> Alok
>>
>> On Fri, Nov 29, 2019 at 4:40 AM David Blaikie <dblaikie at gmail.com> wrote:
>>
>>> Sorry I haven't been more engaged with this thread, I have been reading
>>> it, so hopefully my reply isn't completely out of line/irrelevant - but I
>>> still feel like having a custom dwarf expression operator (& no new
>>> intrinsics), like we have for one or two other DW_OP_LLVM_* (that aren't
>>> actually generated into the DWARF - though this one perhaps could be in
>>> some/all cases as an extension, maybe - or a synthesized variable could be
>>> created for compatibility with the current DWARF standard) would make the
>>> most sense.
>>>
>>> Some thought experiments that I think are relevant:
>>> * does the proposed IR format scale to pointers that don't point to
>>> existing variables (that I think has already been touched on in this thread)
>>> * does the proposed IR format support multiple layers of dereference
>>> (eg: int ** where we know it ultimately points to the value 3 but can't
>>> describe either the first or second level pointers that get to that value)
>>> - it sounds like any intrinsic that's special cased to deref (like
>>> llvm.dbg.derefval) wouldn't be able to capture that, which seems like it's
>>> overly narrow/special case, then?
>>>
>>> On Thu, Nov 28, 2019 at 2:29 PM Alok Sharma via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I am pushing a PoC patch https://reviews.llvm.org/D70833 for review
>>>> which includes the case when temporary is promoted.
>>>>
>>>> For such cases it generates IR as
>>>>
>>>>   call void @llvm.dbg.derefval(metadata i32 3, metadata !25, metadata
>>>> !DIExpression(DW_OP_LLVM_explicit_pointer, DW_OP_LLVM_arg0)), !dbg !32
>>>>
>>>> And llvm-dwarfdump output looks like
>>>>
>>>> -------------
>>>> 0x0000007b:     DW_TAG_inlined_subroutine
>>>>                   DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi")
>>>>                   DW_AT_low_pc  (0x00000000004004c6)
>>>>                   DW_AT_high_pc (0x00000000004004d0)
>>>>                   DW_AT_call_file       ("/home/alok/openllvm/llvm-project_derefval/build.d/david.cc")
>>>>                   DW_AT_call_line       (10)
>>>>                   DW_AT_call_column     (0x03)
>>>>
>>>> 0x00000088:       DW_TAG_formal_parameter
>>>>                     DW_AT_location      (indexed (0x0) loclist = 0x00000010:
>>>>                        [0x00000000004004c6, 0x00000000004004d4): DW_OP_explicit_pointer, DW_OP_lit3)
>>>>                     DW_AT_abstract_origin       (0x00000055 "p")
>>>> ------------
>>>>
>>>> Please note that DW_OP_explicit_pointer denotes that following value
>>>> represents de-referenced value of optimized out pointer. With necessary
>>>> changes in LLDB debugger this dwarf info can help to detect the explicit
>>>> de-referenced value of 'p'.
>>>>
>>>> Hi David,
>>>>
>>>> Should we keep on working for the above case separately and resume the
>>>> review of implicit pointer independently now, which is updated with many
>>>> suggestions from this discussion?
>>>>
>>>> Regards,
>>>> Alok
>>>>
>>>>
>>>> On Wed, Nov 20, 2019 at 11:24 PM Jeremy Morse <
>>>> jeremy.morse.llvm at gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> For a new way of representing things,
>>>>>
>>>>> Adrian wrote:
>>>>> > llvm.dbg.value_new(DILocalVariable("y"),
>>>>> > DIExpression(DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus),
>>>>> >                    %ptr, %ofs)
>>>>>
>>>>> I think this would be great -- there're definitely some constructs
>>>>> created by the induction-variables pass and similar where one could
>>>>> recover an implicit variable value, if you could for example subtract
>>>>> one pointer from another.
>>>>>
>>>>> With the current model of storing DIExpressions as a vector of
>>>>> opcodes, it might become a pain to salvage a Value that gets optimised
>>>>> out -- in the example, if %ofs were salvaged, presumably
>>>>> DW_OP_LLVM_arg1 could have to be replaced with several extra
>>>>> operations. This isn't insurmountable, but I've repeatedly shied away
>>>>> from scanning through DIExpressions to patch them up. A vector of
>>>>> opcodes is the final output of the compiler, IMHO richer metadata
>>>>> should be used in the meantime.
>>>>>
>>>>> IMHO the implicit pointer work doesn't need to block on this. As said
>>>>> my mild preference would be for a new intrinsic for this form of
>>>>> variable location.
>>>>>
>>>>> ~
>>>>>
>>>>> In re PR37682,
>>>>>
>>>>> > I’ve been reminded of PR37682, where a function with a reference
>>>>> > parameter might spend all its time computing the “referenced” value in a
>>>>> > temp, and only move the final value back to the referenced object at the
>>>>> > end.  This is clearly a situation that could benefit from
>>>>> > DW_OP_implicit_pointer, and there is really no other-object DIE for it to
>>>>> > refer to.  Given the current spec, the compiler would need to produce a
>>>>> > DW_TAG_dwarf_procedure for the parameter DIE to refer to.  Appendix D
>>>>> > (Figure D.61) has an example of this construction, although it’s a more
>>>>> > contrived source example.
>>>>>
>>>>> This has been working through my mind too, and I think it's slightly
>>>>> different to what implicit_pointer is trying to achieve. In the case
>>>>> implicit_pointer is designed for, it's a strict improvement in debug
>>>>> experience because you're recovering information that couldn't be
>>>>> expressed. However for PR37682 it's a trade-off between whether the
>>>>> user might want to examine the pointer, or the pointed-at integer:
>>>>> AFAIUI, we can only express one of the two, not both. Whereas for
>>>>> mem2reg'd variables referred to by DIE, there is never a pointer to be
>>>>> lost.
>>>>>
>>>>> I think my preference would always be to see temporarily-promoted
>>>>> values as there's no other way of observing them, but others might
>>>>> disagree.
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Jeremy
>>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
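
Jeremy's point quoted above, that salvaging a Value means patching a flat vector of opcodes, can be illustrated with a small Python sketch. It assumes a toy stack machine with tuple-encoded ops. The `salvage_arg` and `evaluate_ops` helpers and the "ofs = idx * 4" scenario are hypothetical and are not LLVM's salvaging code.

```python
from typing import List, Sequence, Tuple, Union

Op = Union[str, Tuple[str, int]]


def salvage_arg(expr: Sequence[Op], arg_index: int, replacement: Sequence[Op]) -> List[Op]:
    """Copy expr, expanding every reference to argument `arg_index` in place."""
    out: List[Op] = []
    for op in expr:
        if op == ("arg", arg_index):
            out.extend(replacement)  # one opcode becomes several
        else:
            out.append(op)
    return out


def evaluate_ops(expr: Sequence[Op], args: Sequence[int]) -> int:
    """Run the toy expression on a stack and return the top value."""
    stack: List[int] = []
    for op in expr:
        if isinstance(op, tuple) and op[0] == "arg":
            stack.append(args[op[1]])
        elif isinstance(op, tuple) and op[0] == "constu":
            stack.append(op[1])
        elif op == "plus":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        else:
            raise ValueError(f"unknown op: {op!r}")
    return stack[-1]


# y = %ptr + %ofs, mirroring (arg0, arg1, plus) from the thread.
original: List[Op] = [("arg", 0), ("arg", 1), "plus"]
# Suppose %ofs = %idx * 4 is optimized out; argument 1 now refers to %idx.
salvaged = salvage_arg(original, 1, [("arg", 1), ("constu", 4), "mul"])
```

Even in this toy form the rewrite has to scan the whole vector and splice in extra operations, which is the bookkeeping the quoted message is wary of.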

Robinson, Paul via llvm-dev

2019-Dec-19 16:27 UTC

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

I regret to say I also have not been following this with the attention it
deserves, and I am pretty much on holiday until 14 January.
I am particularly surprised by the appearance of something called
DW_OP_LLVM_explicit_pointer, which I wouldn’t have thought necessary and don’t
remember from the discussions that I did read.
I will try to mend my ways and pay more attention when I return.
--paulr

From: David Blaikie <dblaikie at gmail.com>
Sent: Wednesday, December 18, 2019 6:24 PM
To: Alok Sharma <aloksharma.knit at gmail.com>; Adrian Prantl <aprantl
at apple.com>; Jonas Devlieghere <jdevlieghere at apple.com>; Robinson,
Paul <paul.robinson at sony.com>
Cc: Jeremy Morse <jeremy.morse.llvm at gmail.com>; llvm-dev <llvm-dev
at lists.llvm.org>; AlokKumar.Sharma at amd.com; Vedant Kumar
<vedant_kumar at apple.com>
Subject: Re: [llvm-dev] DW_OP_implicit_pointer design/implementation in general


David Blaikie via llvm-dev

2019-Dec-20 00:48 UTC

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

I think the new OP_LLVM extension might've been in response to my
suggestion for something more general, that could handle multiple
indirections to things that weren't existing variables, etc. But I might be
wrong on that.

On Thu, Dec 19, 2019 at 8:27 AM Robinson, Paul <paul.robinson at sony.com>
wrote:
> I regret to say I also have not been following this with the attention it
> deserves, and I am pretty much on holiday until 14 January.
>
> I am particularly surprised by the appearance of something called
> DW_OP_LLVM_explicit_pointer, which I wouldn’t have thought necessary and
> don’t remember from the discussions that I did read.
>
> I will try to mend my ways and pay more attention when I return.
>
> --paulr
>
>
>
> *From:* David Blaikie <dblaikie at gmail.com>
> *Sent:* Wednesday, December 18, 2019 6:24 PM
> *To:* Alok Sharma <aloksharma.knit at gmail.com>; Adrian Prantl <
> aprantl at apple.com>; Jonas Devlieghere <jdevlieghere at
apple.com>; Robinson,
> Paul <paul.robinson at sony.com>
> *Cc:* Jeremy Morse <jeremy.morse.llvm at gmail.com>; llvm-dev <
> llvm-dev at lists.llvm.org>; AlokKumar.Sharma at amd.com; Vedant Kumar
<
> vedant_kumar at apple.com>
> *Subject:* Re: [llvm-dev] DW_OP_implicit_pointer design/implementation in
> general
>
>
>
> (I'm still pretty concerned that there are IR changes going in for a
> feature that seems incomplete and more invasive than really seems justified
> to me - though I admit I'm clearly not paying enough attention to this
> feature to have a nuanced/fully informed opinion & so maybe I just need
to
> step back from all of this - but given the addition of new intrinsics, it
> seems like there should be more clear design discussion)
>
>
>
> On Tue, Dec 10, 2019 at 9:06 PM Alok Sharma <aloksharma.knit at
gmail.com>
> wrote:
>
> Hi David,
>
>
>
> This is regarding missing multilevel handling in branch for explicit
> pointers.
>
>
>
> > * does the proposed IR format support multiple layers of dereference
> (eg: int ** where we know it ultimately points to the value 3 but can't
> describe either the first or second level pointers that get to that value)
> - it sounds like any intrinsic that's special cased to deref (like
> llvm.dbg.derefval) wouldn't be able to capture that, which seems like
it's
> overly narrow/special case, then?
>
>
>
> The PoC of DW_OP_LLVM_explicit_pointer does not have handling of
> multilevel indirection. As of now it is so due to below reason.
>
>
>
>  Explicit pointer handles cases when variable points to a temporary which
> contains constant. Due to language standard constraints, we don't find
> pointers in such cases, what we get is references. Unlike pointers,
> references have single level. (reference to reference is just reference
> while pointer to pointer is double pointer).
>
>  Case of reference to reference,  second level can be handled using
> DW_OP_LLVM_explicit_pointer itself.
>
>  Case of pointer to reference, second level can be handled using
> DW_OP_implicit_pointer.
>
>
>
> Though it would not be complex to make explicit pointer multilevel, I
> avoided so due to lack of use case. Please let me know if I am missing
> something.
>
>
> Sorry, I couldn't understand your language related to references and
> pointers - I don't understand why they would be handled differently or
> represent challenges/tradeoffs for features related to collapsed
> indirection like this.
>
> Multi-level indirection seems to have as much use as single level
> indirection. (if a DWARF user may want to know what a pointer points to
> even when what it points to isn't in memory, the same would hold true
for
> pointers to pointers, etc)
>
> I would expect this to be handled with a general OP saying "hey,
I'm
> skipping one level of indirection indirection in the resulting value,
> because that indirection is missing/not in the final program" and that
this
> would be encoded in a llvm.dbg.value/DIExpression as usual, without the
> need for new IR intrinsics, though possibly with the need for an LLVM
> extension DWARF OP (DW_OP_LLVM_explicit_pointer?)
>
> To reconstitute that general form into the current DWARF limited
> "indirection needs to refer to another variable DIE" issue - as I
think
> Paul speculated previously, we could always reconstitute a synthetic
> variable DIE & not try to reflect the case where the indirection lands
at
> another named/known variable - as I expect that's the minority case. In
> most cases in C++ I expect pointers and references do not refer to named
> variables in the same function. They refer to return values from functions,
> they refer to array elements in dynamically allocated arrays, etc, etc.
>
>
>
>
> Regards,
>
> Alok
>
>
>
>
>
> On Fri, Nov 29, 2019 at 10:12 AM Alok Sharma <aloksharma.knit at
gmail.com>
> wrote:
>
> Let me try to summarize the implementation first.
>
>
>
> At the moment, there are two branches.
>
>
>
> 1. When an existing variable is optimized out and that variable is used to
> get the de-refereced value, pointed to by another pointer/reference
> variable.
>
>   Such cases are being addressed using Dwarf expression
> DW_OP_implicit_pointer as de-referenced value of a pointer can be seen
> implicitly (using another variable). Before Dwarf is dumped in LLVM IR, we
> represent it using dbg.derefval (which denotes derefereced value of pointer
> or reference) and DW_OP_LLVM_implicit_pointer operation.
>
>
>
> 2. When a temporary variable is optimized out and that variable is used to
> get de-referenced value of another reference variable (AFAIK it can not be
> reproduced with pointers)
>
>   Such cases are being addressed using new Dwarf expression
> DW_OP_explicit_pointer as de-referenced value can be displayed explicitly
> (in place). In LLVM IR, we represent it using dbg.derefval and
> DW_OP_LLVM_explicit_pointer operation.
>
>
>
> Both of these two branches have some common implementation to define new
> operations (Dwarf and IR). (D70642, D70643, D69999, D69886).
>
> First branch has additional patches (D70260, 70384, D70385, D70419).
>
> Second branch has additional patch ( D70833).
>
>
>
> Let me try to comment on points raised by you.
>
> - Branch 2, (patch D70833) handles cases when temporaries (not existing
> variables) are optimized out.
>
> - In patch D70385, I have included test points to display that multi
> layered pointers are working
> (llvm/test/DebugInfo/dwarfdump-implicit_pointer_mem2reg.c).
>
>
>
> I feel that review of branch 1 (implicit pointer) can be resumed (which
> was halted due to current discussion), while we can continue to discuss
> branch 2 (explicit pointers D7083) if you want. David, what do you think?
>
>
>
> Regards,
>
> Alok
>
>
>
> On Fri, Nov 29, 2019 at 4:40 AM David Blaikie <dblaikie at gmail.com>
wrote:
>
> Sorry I haven't been more engaged with this thread, I have been reading
> it, so hopefully my reply isn't completely out of line/irrelevant - but
I
> still feel like having a custom dwarf expression operator (& no new
> intrinsics), like we have for one or two other DW_OP_LLVM_* (that
aren't
> actually generated into the DWARF - though this one perhaps could be in
> some/all cases as an extension, maybe - or a synthesized variable could be
> created for compatibility with the current DWARF standard) would make the
> most sense.
>
> Some thought experiments that I think are relevant:
> * does the proposed IR format scale to pointers that don't point to
> existing variables (that I think has already been touched on in this
thread)
> * does the proposed IR format support multiple layers of dereference (eg:
> int ** where we know it ultimately points to the value 3 but can't
describe
> either the first or second level pointers that get to that value) - it
> sounds like any intrinsic that's special cased to deref (like
> llvm.dbg.derefval) wouldn't be able to capture that, which seems like
it's
> overly narrow/special case, then?
>
>
>
> On Thu, Nov 28, 2019 at 2:29 PM Alok Sharma via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Hi folks,
>
>
>
> I am pushing a PoC patch https://reviews.llvm.org/D70833 for review which
> includes the case when temporary is promoted.
>
>
>
> For such cases it generates IR as
>
>
>
>   call void @llvm.dbg.derefval(metadata i32 3, metadata !25, metadata
> !DIExpression(DW_OP_LLVM_explicit_pointer, DW_OP_LLVM_arg0)), !dbg !32
>
>
>
> And llvm-dwarfdump output looks like
>
>
>
> -------------
>
> 0x0000007b:     DW_TAG_inlined_subroutine
>                   DW_AT_abstract_origin (0x0000004f "_Z4sinkRKi")
>                   DW_AT_low_pc  (0x00000000004004c6)
>                   DW_AT_high_pc (0x00000000004004d0)
>                   DW_AT_call_file ("/home/alok/openllvm/llvm-project_derefval/build.d/david.cc")
>                   DW_AT_call_line       (10)
>                   DW_AT_call_column     (0x03)
>
> 0x00000088:       DW_TAG_formal_parameter
>                     DW_AT_location      (indexed (0x0) loclist = 0x00000010:
>                        [0x00000000004004c6, 0x00000000004004d4): DW_OP_explicit_pointer, DW_OP_lit3)
>                     DW_AT_abstract_origin       (0x00000055 "p")
>
> ------------
>
>
>
> Please note that DW_OP_explicit_pointer denotes that the following value
> represents the de-referenced value of an optimized-out pointer. With the
> necessary changes in the LLDB debugger, this DWARF info can help to detect
> the explicit de-referenced value of 'p'.
>
>
>
> Hi David,
>
>
>
> Should we keep on working for the above case separately and resume the
> review of implicit pointer independently now, which is updated with many
> suggestions from this discussion?
>
>
>
> Regards,
>
> Alok
>
>
>
>
>
> On Wed, Nov 20, 2019 at 11:24 PM Jeremy Morse <jeremy.morse.llvm at gmail.com>
> wrote:
>
> Hi,
>
> For a new way of representing things,
>
> Adrian wrote:
> > llvm.dbg.value_new(DILocalVariable("y"),
> >                    DIExpression(DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus),
> >                    %ptr, %ofs)
>
> I think this would be great -- there're definitely some constructs
> created by the induction-variables pass and similar where one could
> recover an implicit variable value, if you could for example subtract
> one pointer from another.
>
> With the current model of storing DIExpressions as a vector of
> opcodes, it might become a pain to salvage a Value that gets optimised
> out -- in the example, if %ofs were salvaged, presumably
> DW_OP_LLVM_arg1 could have to be replaced with several extra
> operations. This isn't insurmountable, but I've repeatedly shied away
> from scanning through DIExpressions to patch them up. A vector of
> opcodes is the final output of the compiler, IMHO richer metadata
> should be used in the meantime.
>
> IMHO the implicit pointer work doesn't need to block on this. As said
> my mild preference would be for a new intrinsic for this form of
> variable location.
>
> ~
>
> Inre PR37682,
>
> > I’ve been reminded of PR37682, where a function with a reference
> parameter might spend all its time computing the “referenced” value in a
> temp, and only move the final value back to the referenced object at the
> end.  This is clearly a situation that could benefit from
> DW_OP_implicit_pointer, and there is really no other-object DIE for it to
> refer to.  Given the current spec, the compiler would need to produce a
> DW_TAG_dwarf_procedure for the parameter DIE to refer to.  Appendix D
> (Figure D.61) has an example of this construction, although it’s a more
> contrived source example.
>
> This has been working through my mind too, and I think it's slightly
> different to what implicit_pointer is trying to achieve. In the case
> implicit_pointer is designed for, it's a strict improvement in debug
> experience because you're recovering information that couldn't be
> expressed. However for PR37682 it's a trade-off between whether the
> user might want to examine the pointer, or the pointed-at integer:
> AFAIUI, we can only express one of the two, not both. Whereas for
> mem2reg'd variables referred to by DIE, there is never a pointer to be
> lost.
>
> I think my preference would always be to see temporarily-promoted
> values as there's no other way of observing them, but others might
> disagree.
>
> --
> Thanks,
> Jeremy
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Alok Sharma via llvm-dev

2019-Dec-23 18:26 UTC

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Hi David,
> Sorry, I couldn't understand your language related to references and
> pointers - I don't understand why they would be handled differently or
> represent challenges/tradeoffs for features related to collapsed
> indirection like this.

Let me try to explain what I wanted to convey with an example.

Example of multilevel pointers:

int var;
int *ptr = &var; // first level of indirection
int **ptrptr = &ptr; // second level of indirection

Example of multilevel references:

int var;
int &ref = var; // first level of reference
int &refref = ref; // second level of reference

Though the variable refref is initialized from another reference, it is
still of plain reference type (int &), not a "reference to reference".
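
The distinction above can be checked at compile time. The following is only a minimal sketch; the alias names and the two helper functions are hypothetical and are not taken from any of the patches under review.

```cpp
#include <cassert>
#include <type_traits>

// Pointers: each added '*' yields a distinct type and a real extra hop.
static_assert(!std::is_same<int *, int **>::value,
              "pointer to pointer is a different type from pointer");

// References: C++ has no "reference to reference" type. Forming one through
// an alias collapses back to a plain lvalue reference.
using IntRef = int &;
using IntRefRef = IntRef &; // reference collapsing: still int &
static_assert(std::is_same<IntRefRef, int &>::value,
              "reference to reference collapses to a plain reference");

// Hypothetical helper: follows two pointer levels to reach the int.
int read_through_pointers(int **pp) { return **pp; }

// Hypothetical helper: refref names the same object as ref and var,
// so there is only ever one level to look through.
bool refs_alias_same_object() {
  int var = 3;
  int &ref = var;
  int &refref = ref;
  return &refref == &var && &ref == &var;
}
```

So a debugger has two hops to describe for ptrptr, but only one for refref, no matter how many references were chained in the source.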

As I said earlier, I am struggling to find a case where multiple levels of
indirection are needed with DW_OP_LLVM_explicit_pointer in the case of
*references* (DW_OP_LLVM_explicit_pointer is used only for references).
Please let me know if you have an example in mind, and I shall modify the
patch to support multilevel indirection.

> Multi-level indirection seems to have as much use as single level
> indirection. (if a DWARF user may want to know what a pointer points to
> even when what it points to isn't in memory, the same would hold true for
> pointers to pointers, etc)

For pointer to pointer, multilevel indirection is already handled, as all
those cases use DW_OP_implicit_pointer.

Regards,
Alok


Alok Sharma via llvm-dev

2019-Dec-23 19:03 UTC

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Hi Paul,

As David already replied about the emergence of
DW_OP_LLVM_explicit_pointer, let me explain a bit more about it.

It is meant to address a case David raised, regarding a variable pointing
to a temporary (which happens in the case of references). For the same case,
you have already suggested a solution (using an artificial variable for the
temporary).
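
A minimal sketch of the kind of source that leads to this case. The llvm-dwarfdump output quoted earlier in the thread shows only the mangled name _Z4sinkRKi, i.e. sink(const int &), and a parameter named p. The function body, the caller and the last_seen variable below are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical observer so the effect of the call can be checked.
static int last_seen = 0;

// Signature matches the mangled name _Z4sinkRKi; the body is assumed.
void sink(const int &p) {
  // Once this is inlined and the temporary is promoted, p has no object in
  // memory to refer to. Only the value it referred to (3) is still known.
  last_seen = p;
}

// Hypothetical caller: binding a literal to a const reference materializes
// a temporary int holding 3, and p refers to that temporary.
int call_sink_with_temporary() {
  sink(3);
  return last_seen;
}
```

There is no named variable behind p here, which is why the choice is between an artificial variable and describing the value inline.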

To make maximum use of the discussion, I tried to provide an additional
option to choose from.

Note that this case is not handled even by GNU gcc. So whatever gcc handles
should be a *must* for us, and anything beyond that should be *aspire*.

Now, to include that aspire case, we have two options:

1. Create an artificial variable (flip side: we need to carry an extra
artificial DIE)
2. Define the value inline using DW_OP_LLVM_explicit_pointer (flip side: a
new operator needs to be introduced)

I think we should go ahead with the *must* functionality anyway and choose
one of the options for the *aspire* functionality.

Regards,
Alok

Since this case


On Thu, Dec 19, 2019 at 9:57 PM Robinson, Paul <paul.robinson at sony.com> wrote:
> I regret to say I also have not been following this with the attention it
> deserves, and I am pretty much on holiday until 14 January.
>
> I am particularly surprised by the appearance of something called
> DW_OP_LLVM_explicit_pointer, which I wouldn’t have thought necessary and
> don’t remember from the discussions that I did read.
>
> I will try to mend my ways and pay more attention when I return.
>
> --paulr
