thr3ads.net - llvm dev - [llvm-dev] DW_OP_implicit_pointer design/implementation in general [Nov 2019]

David Blaikie via llvm-dev

2019-Nov-14 21:21 UTC

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Hey folks,

Would you all mind having a bit of a design discussion around the feature
both at the DWARF level and the LLVM implementation? It seems like what's
currently being proposed/reviewed (based on the DWARF feature as spec'd) is
a pretty big change & I'm not sure I understand the motivation, exactly.

The core point of my confusion: Why does describing the thing a pointer
points to require describing a named variable that it points to? What if it
doesn't point to a named variable?

Seems like there should be a way to describe that situation - and that
doing so would be a more general solution than one limited to only
describing pointers that point to named variables. And would be a simpler
implementation in LLVM - without having to deconstruct variables during
optimizations, etc, to track one variable's value being concretely related
to another variable's value.

- David

Adrian Prantl via llvm-dev

2019-Nov-14 21:26 UTC

> On Nov 14, 2019, at 1:21 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> Hey folks,
>
> Would you all mind having a bit of a design discussion around the feature
> both at the DWARF level and the LLVM implementation? It seems like what's
> currently being proposed/reviewed (based on the DWARF feature as spec'd) is
> a pretty big change & I'm not sure I understand the motivation, exactly.
>
> The core point of my confusion: Why does describing the thing a pointer
> points to require describing a named variable that it points to? What if it
> doesn't point to a named variable?

Without having looked at the motivational text when the feature was proposed to
DWARF, my assumption was that this is similar to how bounds for variable-length
arrays are implemented, where a (potentially) artificial variable is created by
the compiler in order to have something to refer to. In retrospect I find the
entire specification of DW_OP_implicit_pointer to be strangely specific/limited
(why one hard-coded offset instead of an arbitrary expression?), but that ship
has sailed for DWARF 5 and I'm to blame for not voicing that concern earlier.


-- adrian
>
> Seems like there should be a way to describe that situation - and that
> doing so would be a more general solution than one limited to only describing
> pointers that point to named variables. And would be a simpler implementation
> in LLVM - without having to deconstruct variables during optimizations, etc,
> to track one variable's value being concretely related to another variable's
> value.
>
> - David

David Blaikie via llvm-dev

2019-Nov-14 21:33 UTC

On Thu, Nov 14, 2019 at 1:27 PM Adrian Prantl <aprantl at apple.com> wrote:
>
> > On Nov 14, 2019, at 1:21 PM, David Blaikie <dblaikie at gmail.com> wrote:
> >
> > Hey folks,
> >
> > Would you all mind having a bit of a design discussion around the
> > feature both at the DWARF level and the LLVM implementation? It seems like
> > what's currently being proposed/reviewed (based on the DWARF feature as
> > spec'd) is a pretty big change & I'm not sure I understand the motivation,
> > exactly.
> >
> > The core point of my confusion: Why does describing the thing a pointer
> > points to require describing a named variable that it points to? What if it
> > doesn't point to a named variable?
>
> Without having looked at the motivational text when the feature was
> proposed to DWARF, my assumption was that this is similar to how bounds for
> variable-length arrays are implemented, where a (potentially) artificial
> variable is created by the compiler in order to have something to refer to.

I /sort/ of see that case as a bit different, because the array type needs
to refer back into the function potentially (to use frame-relative, etc). I
could think of other ways to do that in hindsight (like putting the array
type definition inside the function to begin with & having the count
describe the location directly, for instance).

> In retrospect I find the entire specification of DW_OP_implicit_pointer to
> be strangely specific/limited (why one hard-coded offset instead of an
> arbitrary expression?), but that ship has sailed for DWARF 5 and I'm to
> blame for not voicing that concern earlier.
>
Sure, but we don't have to implement it if we don't find it to be super
useful/worthwhile, right? (if something else would be particularly more
general/useful we could instead implement that as an extension, though of
course there's cost to that in terms of consumer support, etc)

>
>
> -- adrian
>
> >
> > Seems like there should be a way to describe that situation - and that
> > doing so would be a more general solution than one limited to only
> > describing pointers that point to named variables. And would be a simpler
> > implementation in LLVM - without having to deconstruct variables during
> > optimizations, etc, to track one variable's value being concretely related
> > to another variable's value.
> >
> > - David
>

Jeremy Morse via llvm-dev

2019-Nov-18 16:33 UTC

Hi llvm-dev@,

Switching focus to the LLVM implementation, the significant change is
using dbg.value's first operand to refer to a DILocalVariable, rather
than a Value. There's some impedance mismatch here, because all the
documentation (for example in the DbgVariableIntrinsic class)
expresses everything in terms of the variable's location, whereas
implicit pointers don't have a location as they represent an extra
level of indirection. This is best demonstrated by the change to
IntrinsicInst.cpp in this patch [0] -- calling getVariableLocation on
any normal dbg.value will return the location's Value, but if it's an
implicit pointer then you'll get the meaningless MetadataAsValue
wrapper back instead. This isn't the variable location, might surprise
existing handlers of dbg.values, and just seems a little off.

I can see why this route has been taken, but putting a non-Value in
dbg.values really changes what dbg.values represent: a variable
location in the IR. Is there any appetite out there for using a
different intrinsic, something like 'dbg.loc.implicit', instead of
using dbg.value? IMO it would be worthwhile to separate:
 * Debug intrinsics where their position in the IR is important, from
 * Debug intrinsics where both their position in the IR, _and_ a Value
   in the IR, are important.
Of which (I think) implicit pointers are the former, and current [2]
dbg.values are the latter. This would also avoid putting
DW_OP_implicit_pointer into expressions in the IR, pre-isel at least.

There's also Vedant's suggestion [1] for linking implicit pointer
locations with the dbg.values of the underlying DILocalVariable. I
suspect the presence of control flow might make it difficult (there's
no dbg.phi instruction), but I like the idea of having more explicit
links in the IR; it would be much clearer to interpret what's going
on.

[0] https://reviews.llvm.org/D69999?id=229790
[1] https://reviews.llvm.org/D69886#1736182
[2] Technically dbg.value(undef,...) is the former too, I guess.

--
Thanks,
Jeremy
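
The mismatch described in this message can be sketched with a toy Python model. This is not LLVM code: the class and function names below are hypothetical stand-ins chosen for illustration, and in LLVM itself MetadataAsValue is a subclass of Value (the toy keeps the two apart only to make the distinction visible). It assumes a handler that expects every dbg.value to yield an IR Value as its location.

```python
from dataclasses import dataclass


@dataclass
class Value:
    """Stand-in for an IR value such as %x (toy model, not llvm::Value)."""
    name: str


@dataclass
class MetadataAsValue:
    """Stand-in for the wrapper returned when the operand is not an IR value."""
    metadata: str


@dataclass
class DbgValue:
    """Toy dbg.value: first operand plus the variable it describes."""
    operand: object
    variable: str

    def get_variable_location(self):
        # Mirrors the behaviour described in the message: the operand is
        # handed back as-is, whatever kind of thing it is.
        return self.operand


def describes_ir_value(intrinsic):
    """Return True only when the intrinsic's location is a real IR value."""
    return isinstance(intrinsic.get_variable_location(), Value)
```

A handler written against the "location is always a Value" assumption would need a check like describes_ir_value before touching the result, which is the kind of surprise a separate intrinsic would avoid.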

Adrian Prantl via llvm-dev

2019-Nov-19 17:41 UTC

> On Nov 18, 2019, at 8:33 AM, Jeremy Morse <jeremy.morse.llvm at gmail.com> wrote:
> 
> Hi llvm-dev@,
> 
> Switching focus to the LLVM implementation, the significant change is
> using dbg.value's first operand to refer to a DILocalVariable, rather
> than a Value. There's some impedance mismatch here, because all the
> documentation (for example in the DbgVariableIntrinsic class)
> expresses everything in terms of the variable's location, whereas
> implicit pointers don't have a location as they represent an extra
> level of indirection. This is best demonstrated by the change to
> IntrinsicInst.cpp in this patch [0] -- calling getVariableLocation on
> any normal dbg.value will return the location's Value, but if it's an
> implicit pointer then you'll get the meaningless MetadataAsValue
> wrapper back instead. This isn't the variable location, might surprise
> existing handlers of dbg.values, and just seems a little off.
> 
> I can see why this route has been taken, but putting a non-Value in
> dbg.values really changes what dbg.values represent: a variable
> location in the IR. Is there any appetite out there for using a
> different intrinsic, something like 'dbg.loc.implicit', instead of
> using dbg.value? IMO it would be worthwhile to separate:
> * Debug intrinsics where their position in the IR is important, from
> * Debug intrinsics where both their position in the IR, _and_ a Value
>   in the IR, are important.
> Of which (I think) implicit pointers are the former, and current [2]
> dbg.values are the latter. This would also avoid putting
> DW_OP_implicit_pointer into expressions in the IR, pre-isel at least.
> 

On that particular point, what I would like to see is a generalization of
dbg.value: Currently llvm.dbg.value binds an SSA value (including constants and
undef) and a DIExpression to a DILocalVariable at a position in the instruction
stream. That first SSA value argument is an implicit first element in the
DIExpression.

A more general form would be a more printf-like signature:

llvm.dbg.value(DILocalVariable, DIExpression, ...)

for example

llvm.dbg.value_new(DILocalVariable("x"), DIExpression(DW_OP_LLVM_arg0), %x)
llvm.dbg.value_new(DILocalVariable("y"),
                   DIExpression(DW_OP_LLVM_arg0, DW_OP_LLVM_arg1, DW_OP_plus),
                   %ptr, %ofs)
llvm.dbg.value_new(DILocalVariable("z"),
                   DIExpression(DW_OP_implicit_pointer, DW_OP_LLVM_arg0, 32),
                   DILocalVariable("base"))
llvm.dbg.value_new(DILocalVariable("c"), DIExpression(DW_OP_constu, 1))

The mandatory arguments would be the variable and the expression, followed by
an arbitrary number of SSA values and potentially other variables.
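
To make the printf-like form concrete, here is a minimal Python sketch of how such an expression might be evaluated against its trailing arguments. It is a toy stack machine, not LLVM code: the evaluate function is hypothetical, the op names are simply the spellings used in the examples above, and only the three ops needed for the "x", "y" and "c" cases are handled.

```python
ARG_PREFIX = "DW_OP_LLVM_arg"  # spelling taken from the proposal above


def evaluate(expr, args):
    """Evaluate a flat list of ops; DW_OP_LLVM_argN pushes args[N]."""
    stack = []
    ops = iter(expr)
    for op in ops:
        if isinstance(op, str) and op.startswith(ARG_PREFIX):
            # The numeric suffix selects which trailing argument to push.
            stack.append(args[int(op[len(ARG_PREFIX):])])
        elif op == "DW_OP_constu":
            # The constant is the next element of the expression.
            stack.append(next(ops))
        elif op == "DW_OP_plus":
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs)
        else:
            raise ValueError(f"unsupported op in this sketch: {op!r}")
    return stack[-1]
```

With this model, the "y" example corresponds to evaluate(["DW_OP_LLVM_arg0", "DW_OP_LLVM_arg1", "DW_OP_plus"], [ptr, ofs]), and the "c" example needs no trailing arguments at all, which is the point of making the SSA operands optional.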


As far as DW_OP_implicit_pointer in particular is concerned, we could also
treat its peculiarities as a DWARF implementation detail: introduce
DW_OP_LLVM_implicit_pointer, which transforms the top-of-stack into an implicit
pointer (similar to DW_OP_stack_value), and have the DWARF backend insert an
artificial variable on the fly.

LLVM IR:

llvm.dbg.value(%base, DILocalVariable("z"),
               DIExpression(DW_OP_LLVM_implicit_pointer))

AsmPrinter would expand this into two DW_TAG_variable tags with one location
(list) entry each.
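
The expansion described above can be sketched as follows. This is a toy Python model under stated assumptions, not the AsmPrinter: lower_implicit_pointer is a hypothetical helper, DIEs are plain dicts, and the artificial variable is assumed to be unnamed and flagged artificial. The one DWARF fact relied on is that DW_OP_implicit_pointer takes a reference to a DIE plus a byte offset.

```python
def lower_implicit_pointer(var_name, base_location):
    """Expand one LLVM-level implicit pointer into two variable DIEs.

    base_location is whatever location the backend computed for %base,
    modelled here as an opaque tuple of ops.
    """
    # Artificial variable holding the pointed-to value (assumed unnamed).
    pointee = {
        "tag": "DW_TAG_variable",
        "name": None,
        "artificial": True,
        "location": base_location,
    }
    # The user-visible pointer refers to the artificial DIE at offset 0.
    pointer = {
        "tag": "DW_TAG_variable",
        "name": var_name,
        "location": ("DW_OP_implicit_pointer", pointee, 0),
    }
    return pointer, pointee
```

For the "z" example, lower_implicit_pointer("z", ("DW_OP_reg0",)) would yield the two DW_TAG_variable entries, each with a single location, where DW_OP_reg0 is only a placeholder for wherever %base ends up.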

-- adrian
> There's also Vedant's suggestion [1] for linking implicit pointer
> locations with the dbg.values of the underlying DILocalVariable. I
> suspect the presence of control flow might make it difficult (there's
> no dbg.phi instruction), but I like the idea of having more explicit
> links in the IR; it would be much clearer to interpret what's going
> on.
> 
> [0] https://reviews.llvm.org/D69999?id=229790
> [1] https://reviews.llvm.org/D69886#1736182
> [2] Technically dbg.value(undef,...) is the former too, I guess.
> 
> --
> Thanks,
> Jeremy
