thr3ads.net - llvm dev - [llvm-dev] DW_OP_implicit_pointer design/implementation in general [Nov 2019]


David Blaikie via llvm-dev

2019-Nov-14 21:33 UTC

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

On Thu, Nov 14, 2019 at 1:27 PM Adrian Prantl <aprantl at apple.com> wrote:
>
> > On Nov 14, 2019, at 1:21 PM, David Blaikie <dblaikie at gmail.com> wrote:
> >
> > Hey folks,
> >
> > Would you all mind having a bit of a design discussion around the
> > feature both at the DWARF level and the LLVM implementation? It seems
> > like what's currently being proposed/reviewed (based on the DWARF
> > feature as spec'd) is a pretty big change & I'm not sure I understand
> > the motivation, exactly.
> >
> > The core point of my confusion: Why does describing the thing a pointer
> > points to require describing a named variable that it points to? What
> > if it doesn't point to a named variable?
>
> Without having looked at the motivational text when the feature was
> proposed to DWARF, my assumption was that this is similar to how bounds for
> variable-length arrays are implemented, where a (potentially) artificial
> variable is created by the compiler in order to have something to refer to.

I /sort/ of see that case as a bit different, because the array type needs
to refer back into the function potentially (to use frame-relative, etc). I
could think of other ways to do that in hindsight (like putting the array
type definition inside the function to begin with & having the count
describe the location directly, for instance).

> In retrospect I find the entire specification of DW_OP_implicit_pointer to
> be strangely specific/limited (why one hard-coded offset instead of an
> arbitrary expression?), but that ship has sailed for DWARF 5 and I'm to
> blame for not voicing that concern earlier.
>
Sure, but we don't have to implement it if we don't find it to be super
useful/worthwhile, right? (if something else would be particularly more
general/useful we could instead implement that as an extension, though of
course there's cost to that in terms of consumer support, etc)

>
>
> -- adrian
>
> >
> > Seems like there should be a way to describe that situation - and that
> > doing so would be a more general solution than one limited to only
> > describing pointers that point to named variables. And would be a
> > simpler implementation in LLVM - without having to deconstruct variables
> > during optimizations, etc, to track one variable's value being
> > concretely related to another variable's value.
> >
> > - David
>

David Blaikie via llvm-dev

2019-Nov-14 21:42 UTC


On Thu, Nov 14, 2019 at 1:33 PM David Blaikie <dblaikie at gmail.com>
wrote:
> I /sort/ of see that case as a bit different, because the array type needs
> to refer back into the function potentially (to use frame-relative, etc). I
> could think of other ways to do that in hindsight (like putting the array
> type definition inside the function to begin with & having the count
> describe the location directly, for instance).
>
Hey, what do you know, GCC actually produces what I described:

  DW_TAG_array_type [6] *
    DW_AT_type [DW_FORM_ref4]       (cu + 0x0069 => {0x00000069} "int")
    DW_AT_sibling [DW_FORM_ref4]    (cu + 0x0083 => {0x00000083})

    DW_TAG_subrange_type [7]
      DW_AT_type [DW_FORM_ref4]     (cu + 0x0083 => {0x00000083} "long unsigned int")
      DW_AT_upper_bound [DW_FORM_exprloc]   (DW_OP_fbreg -40, DW_OP_deref)

No artificial variable the way Clang does it:

  DW_TAG_subprogram
    DW_TAG_variable [4]
      DW_AT_location [DW_FORM_exprloc]      (DW_OP_fbreg -24)
      DW_AT_name [DW_FORM_strp]     ( .debug_str[0x000000a6] "__vla_expr0")
      DW_AT_type [DW_FORM_ref4]     (cu + 0x0074 => {0x00000074} "long unsigned int")
      DW_AT_artificial [DW_FORM_flag_present]       (true)

    DW_TAG_variable
      ...
      DW_AT_type [DW_FORM_ref4]     (cu + 0x007b => {0x0000007b} "int[]")

  DW_TAG_array_type [7] *
    DW_AT_type [DW_FORM_ref4]       (cu + 0x006d => {0x0000006d} "int")

    DW_TAG_subrange_type [8]
      DW_AT_type [DW_FORM_ref4]     (cu + 0x008b => {0x0000008b} "__ARRAY_SIZE_TYPE__")
      DW_AT_lower_bound [DW_FORM_data1]     (0x00)
      DW_AT_count [DW_FORM_ref4]    (cu + 0x0051 => {0x00000051})

Might be nice to tidy that up some time. GCC's been doing this even in
DWARF-2 mode (where it just uses FORM_block for the upper bound), and as far
back as GCC 6.0 at least.
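
As a rough illustration of why the GCC form needs no helper variable, the
sketch below evaluates the expression (DW_OP_fbreg -40, DW_OP_deref) against a
toy stack frame. Everything in it is hypothetical scaffolding: ToyFrame, Step,
evaluate, make_frame and gcc_upper_bound_expr are made-up names, and a 64-bit
target with host byte order is assumed. It is not how any real debugger is
structured, only the same two stack-machine steps.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Toy model of a stack frame (hypothetical, for illustration only).
struct ToyFrame {
    std::uint64_t frame_base;          // value DW_OP_fbreg offsets are relative to
    std::uint64_t low;                 // address of bytes[0]
    std::vector<std::uint8_t> bytes;   // frame memory, host byte order
};

enum class Op { Fbreg, Deref };        // stand-ins for DW_OP_fbreg / DW_OP_deref

struct Step {
    Op op;
    std::int64_t operand;              // only meaningful for Fbreg
};

// Read an address-sized (assumed 64-bit) value from the toy frame memory.
inline std::uint64_t read_u64(const ToyFrame& f, std::uint64_t addr) {
    assert(addr >= f.low && addr - f.low + 8 <= f.bytes.size());
    std::uint64_t v;
    std::memcpy(&v, f.bytes.data() + (addr - f.low), sizeof v);
    return v;
}

// Evaluate the steps on a value stack: Fbreg pushes frame_base + offset,
// Deref replaces the top-of-stack address with the value stored there.
inline std::uint64_t evaluate(const ToyFrame& f, const std::vector<Step>& expr) {
    std::vector<std::uint64_t> stack;
    for (const Step& s : expr) {
        switch (s.op) {
        case Op::Fbreg:
            stack.push_back(f.frame_base + static_cast<std::uint64_t>(s.operand));
            break;
        case Op::Deref:
            assert(!stack.empty());
            stack.back() = read_u64(f, stack.back());
            break;
        }
    }
    assert(!stack.empty());
    return stack.back();
}

// The expression from the GCC dump: (DW_OP_fbreg -40, DW_OP_deref).
inline std::vector<Step> gcc_upper_bound_expr() {
    return {{Op::Fbreg, -40}, {Op::Deref, 0}};
}

// Build a 64-byte toy frame whose slot at frame_base - 40 holds `bound`.
inline ToyFrame make_frame(std::uint64_t bound) {
    ToyFrame f{0x1040, 0x1000, std::vector<std::uint8_t>(64, 0)};
    std::memcpy(f.bytes.data() + (f.frame_base - 40 - f.low), &bound, sizeof bound);
    return f;
}
```

Calling evaluate(make_frame(9), gcc_upper_bound_expr()) yields the bound
stored at frame base - 40. With the Clang form a consumer would instead follow
DW_AT_count to the __vla_expr0 DIE and evaluate that variable's
DW_AT_location (DW_OP_fbreg -24) to read the value.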


Robinson, Paul via llvm-dev

2019-Nov-14 21:53 UTC


My reading of the DWARF issue is that it was fairly specifically designed to
handle the case of a function taking parameters by pointer/reference, which is
then inlined, and the caller is passing local objects rather than other
pointers/references.  So:

void inline_me(foo *ptr) {
  // does something with ptr->x or *ptr
}
void caller() {
  foo actual_obj;
  inline_me(&actual_obj);
}

After inlining, maintaining a pointer to actual_obj might be sub-optimal, but
after a “step in” to inline_me, the user wants to look at an expression spelled
*ptr even though the actual_obj might not have a memory address (because fields
are SROA’d into registers, or whatever).  This is where DW_OP_implicit_pointer
saves the day; *ptr and ptr->x are still evaluatable expressions, which
expressions are secretly indirecting through the DIE for actual_obj.
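
A minimal sketch of what a consumer might do with this, assuming a toy
debugger model: the location of ptr carries no address, only the two operands
DWARF 5 gives DW_OP_implicit_pointer (a reference to the DIE for the
pointed-to object and a signed byte offset). ToyDebugger, ImplicitPointer,
make_state, the DIE offset 0x2a and the layout of foo are all invented for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical stand-in for a DIE offset in .debug_info.
using DieRef = std::uint32_t;

// Model of the two operands of DW_OP_implicit_pointer: a reference to the DIE
// describing the pointed-to object, plus a signed byte offset into it.
struct ImplicitPointer {
    DieRef target;
    std::int64_t byte_offset;
};

// Toy debugger state: for each variable DIE, the bytes the debugger has
// reassembled from wherever the pieces live (registers, constants, ...).
struct ToyDebugger {
    std::map<DieRef, std::vector<std::uint8_t>> reassembled;

    // "Dereference" an implicit pointer: there is no address to load from,
    // so read sizeof(T) bytes out of the target DIE's value instead.
    template <typename T>
    T deref(const ImplicitPointer& p) const {
        auto it = reassembled.find(p.target);
        assert(it != reassembled.end());
        assert(p.byte_offset >= 0);
        auto off = static_cast<std::size_t>(p.byte_offset);
        assert(off + sizeof(T) <= it->second.size());
        T out;
        std::memcpy(&out, it->second.data() + off, sizeof(T));
        return out;
    }
};

// Assumed layout for the example; the thread never defines foo.
struct foo {
    std::int32_t x;
    std::int32_t y;
};

// State after inlining: actual_obj (hypothetical DIE 0x2a) has no memory
// address, but its fields are known to be 7 and 11.
inline ToyDebugger make_state() {
    foo value{7, 11};
    std::vector<std::uint8_t> bytes(sizeof value);
    std::memcpy(bytes.data(), &value, sizeof value);
    ToyDebugger d;
    d.reassembled[0x2a] = bytes;
    return d;
}
```

Evaluating *ptr then becomes deref<foo>() on the bytes reassembled for
actual_obj, and ptr->x reads the x field from that result, without actual_obj
ever having an address.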

I think it is not widely applicable outside of that kind of scenario.
--paulr


David Blaikie via llvm-dev

2019-Nov-14 22:31 UTC


On Thu, Nov 14, 2019 at 1:53 PM Robinson, Paul <paul.robinson at sony.com>
wrote:
> My reading of the DWARF issue is that it was fairly specifically designed
> to handle the case of a function taking parameters by pointer/reference,
> which is then inlined, and the caller is passing local objects rather than
> other pointers/references.  So:
>
> void inline_me(foo *ptr) {
>   // does something with ptr->x or *ptr
> }
> void caller() {
>   foo actual_obj;
>   inline_me(&actual_obj);
> }
>
> After inlining, maintaining a pointer to actual_obj might be sub-optimal,
> but after a “step in” to inline_me, the user wants to look at an expression
> spelled *ptr even though the actual_obj might not have a memory address
> (because fields are SROA’d into registers, or whatever).  This is where
> DW_OP_implicit_pointer saves the day; *ptr and ptr->x are still evaluatable
> expressions, which expressions are secretly indirecting through the DIE for
> actual_obj.
>
> I think it is not widely applicable outside of that kind of scenario.
>
Any ideas why it wouldn't be more general to handle cases where the
variable isn't named? Such as:

foo source();
void f(foo);
inline void sink(foo* p) {
  f(*p);
}
int main() {
  sink(&source());
}
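
Note that `&source()` takes the address of a temporary, which standard C++
rejects. A variant that compiles and still has the callee refer to an unnamed
object might look like the sketch below. The body of source, the return value
of f and the name run_example are assumptions added only so the example is
self-contained and its effect is observable.

```cpp
struct foo { int x; };                 // assumed definition; the thread leaves foo opaque

foo source() { return foo{42}; }       // hypothetical producer of an unnamed temporary
int f(foo v) { return v.x; }           // hypothetical consumer; returns a value for checking

// Taking a const reference lets the call bind directly to the temporary,
// so after inlining `p` refers to an object that has no name in the caller.
inline int sink(const foo& p) { return f(p); }

int run_example() { return sink(source()); }
```

Here there is no variable DIE in the caller for the object that p refers to,
which is the situation the question is about.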
