thr3ads.net - llvm dev - [llvm-dev] DW_OP_implicit_pointer design/implementation in general [Jul 2020]

If this information is useful, please help other people find it:
Share via:

David Blaikie via llvm-dev

2020-Jul-21 23:25 UTC

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Realized I didn't document the original reviews that motivated this thread:

A stack of reviews, split off from here: https://reviews.llvm.org/D69787

Alok's posted a new patch (with smaller patches split off from the
monolithic one) here: https://reviews.llvm.org/D84112

I haven't had a chance to page in all the old context, nor look at the
new ones in detail yet. But probably worth keeping high level design
review here, I think? Once the general direction seems good, we can go
into the separate review threads for the implementation/mechanical
details.

On Tue, Jan 14, 2020 at 8:47 AM David Blaikie <dblaikie at gmail.com>
wrote:>
>
>
> On Mon, Jan 13, 2020 at 8:49 PM Alok Sharma <aloksharma.knit at
gmail.com> wrote:
>>
>> Hi,
>>
>> Let me consolidate what we discussed with my opinion.
>>
>> * On the point of new intrinsic llvm.dbg.derefval:
>> It (new intrinsic) was more a neater way than a needed way. The whole
functionality can go ahead without it and using llvm.dbg.value instead. Though I
liked it (new intrinsic), since most of us are against it, it should be fine for
me to drop it.
>> This is because the transformation was like
>> llvm.dbg.value -> DBG_VALUE -> DWARF location-list
>> llvm.dbg.derefval -> DBG_VALUE -> DWARF location-list
>> Since it was just for better readability in LLVM IR only later it (new
intrinsic) was sharing the same path with llvm.dbg.value. So it should be fine
to drop it without any impact in later functionality.
>> - On question of identify such cases we can anyway identify using the
type of expression (DW_OP_implicit_pointer).
>> - On question of ( DW_OP_implicit_pointer ) fitting to dbg.value
intrinsic it perfectly does that as value in such case is metadata and prototype
of dbg.value is dbg.value(metadata, metadata, metadata).
>> So it should be fine to drop it and back to where it was started before
introduction of new intrinsic.
>>
>> * On the point of handing of pointer pointing to temporary / unnamed
variables (Lets call it Scope S1)
>> As two proposed patches are there for bringing pointers referring to
temporary / unnamed variable
>>    A) first patch uses (new proposed operator)
DW_OP_LLVM_explicit_pointer(both in LLVM-IR and DWARF)
>>    B) Second patch uses artificial variable representing temporary
(both in LLVM-IR and DWARF)
>>          https://reviews.llvm.org/D72055 [DebugInfo] Support for
DW_OP_implicit_pointer (for temp references & dynamic allocations)
>> If I understood David correctly, he wants LLVM-IR look like patch-A and
DWARF look like patch-B (lets call it way C)
>> Since patch-A is not desired because we don't support anything
beyond DWARF-5 and patch proposes new DWARF operator. I want to clarify that
patch-B can exist even without new intrinsic and can use dbg.value and fits
perfectly in existing LLVM-IR template. if only reason to go way-C is to
>> I would like to go way-B or way-C for the scope of unnamed variables.
>>
>> * For the cases when pointer points to named variable (Lets call it
Scope S2):
>> I would update the patches with replacing dbg.derefval to dbg.value and
using DW_OP_implicit_pointer (to named variable) in both LLVM-IR and DWARF.
>>
>> In summary,
>> Scope S1 can be solved with
>> way-B)  DW_OP_implicit_pointer with artificial variable and with
intrinsic dbg.value in LLVM-IR and DWARF
>> or
>> way-C) DW_OP_LLVM_explicit_pointer with intrinsic dbg.value in LLVM-IR
+ DW_OP_implicit_pointer with artificial variable in DWARF
>>
>> Scope S2 can be solved with
>> DW_OP_implicit_pointer with actual named variable with dbg.value in
LLVM-IR and DWARF
>
>
> Based on my current understanding, I'd rather not do this ^ it seems
like added complexity to LLVM optimizations/middle end/IR representation with
limited value. I think doing (C) above but using it to cover both S1 and S2 (ie:
not special casing "pointing to an existing variable" - instead
treating that the same as "pointing to an unnamed entity") initially,
and then, maybe later, evaluating whether referencing named variables from
dbg.value is worth the added benefit, would be the right way to go.
>
>>
>>
>> Regards,
>> Alok
>>
>>
>> Though
>>
>>
>>
>> lets
>>
>> On Tue, Jan 14, 2020 at 1:28 AM Jeremy Morse <jeremy.morse.llvm at
gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> David wrote:
>>> > Are you referring to the possibility of implicit values to
refer to other variables? I'm sort of interested in maybe not doing that -
and only implementing a more general form (what's been talked about with the
LLVM_implicit_value (or was it LLVM_explicit_value? I forget)) - and
synthesizing artificial variables in the backend rather than trying to track
which variable a pointer points to. I think this would keep the impact on
optimizations smaller & would be more general.
>>>
>>> Adrian wrote:
>>> > If it were possible to synthesize it in AsmPrinter, would that
remove the motivation for the new intrinsic for you?
>>>
>>> Ah, yeah, those changes would avoid any need for a new intrinsic to
my
>>> mind, and sounds much more palatable. Thanks for explaining.
>>>
>>> --
>>> Thanks,
>>> Jeremy

Jeremy Morse via llvm-dev

2020-Jul-30 11:59 UTC

head link

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

Hi,

I've taken a look at the patches (thanks Alok) and will submit
comments in a bit,

David wrote:> I haven't had a chance to page in all the old context, nor look at the
> new ones in detail yet. But probably worth keeping high level design
> review here, I think? Once the general direction seems good, we can go
> into the separate review threads for the implementation/mechanical
> details.
I think the latest patch series matches what came out of the
discussion above, as you described it:
> I would expect this to be handled with a general OP saying "hey,
I'm
> skipping one level of indirection indirection in the resulting value,
> because that indirection is missing/not in the final program" and that
this
> would be encoded in a llvm.dbg.value/DIExpression as usual, without the
> need for new IR intrinsics, though possibly with the need for an LLVM
> extension DWARF OP (DW_OP_LLVM_explicit_pointer?)
That's what's been implemented, whenever an alloca is promoted,
variable locations that used the allocas address are transformed into
promoted-value variable locations in the usual way, but with a
DW_OP_LLVM_explicit_pointer at the front of the expression to indicate
"the pointer is absent, but this is what it would have pointed at".
Simple case:

  i32 *%foo = alloca i32
  dbg.declare(%foo, !123, !DIExpression())
  dbg.value(%foo, !456, !DIExpression())
  store i32 0, i32 *%foo

Where !123 is a plain i32 source variable, and !456 a pointer-to-i32
source variable. When %foo is promoted, these would become:

  dbg.value(i32 0, !123, !DIExpression())
  dbg.value(i32 0, !456, !DIExpression(DW_OP_LLVM_explicit_pointer))

When it comes to the IR way of modelling these things, I think that
this matches the discussion, and is a lightweight way of representing
what's going on.

I have some reservations about further down the compiler though:
artificial variables get created at isel time, which seems early to
me, and duplicates the work for each instruction selector. Is there a
reason why it can't be done in the DWARF emitter? The artificial
variables are also tracked with additional DBG_VALUE instructions, if
we could push artificial variable creation back to emission time then
we wouldn't have to answer questions such as "what is the lifetime of
a DBG_VALUE of an artificial variable?"

At promotion time: some of the handling of variable promotion appears
to happen within Instruction::eraseFromParent, which seems out of
place. I reckon you've missed the calls in PromoteMemoryToRegister.cpp
to the ConvertDebugDeclareToDebugValue helpers -- shifting the
promotion handling there would be better, and not dependent on the
order that things are erased in. I think those ConvertDebug... helper
functions and the two other functions you've instrumented in the same
file should be sufficient to catch all promotions.

Additionally, I believe that promoted allocas are getting
DW_OP_LLVM_explicit_pointer dbg.values generated for any pointer that
_ever_ points at it. You'll need to consider circumstances where
pointer variables have multiple values, i.e.:

  int foo, bar, baz;
  int *qux = &foo;
  qux = &bar;
  qux = &baz;
  foo = 1;
  bar = 2;
  baz = 3;

If I understood the code correctly, 'qux' will have implicit-pointer
values for each of the assignments to foo / bar / baz, where it should
only have a dbg.value for the assignment to 'baz'. (It might be
alright to limit handling to scenarios where a pointer variable only
ever has one value, and then expand what can be handled later).

--
Thanks,
Jeremy

David Blaikie via llvm-dev

2020-Sep-24 02:56 UTC

head link

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

[+CC some folks interested in optimized debug info variable locations]

On Thu, Jul 30, 2020 at 5:00 AM Jeremy Morse
<jeremy.morse.llvm at gmail.com> wrote:>
> Hi,
>
> I've taken a look at the patches (thanks Alok) and will submit
> comments in a bit,
Thanks for that! Looks like you've hit a lot of the usual/important
bits, for sure.
>
> David wrote:
> > I haven't had a chance to page in all the old context, nor look at
the
> > new ones in detail yet. But probably worth keeping high level design
> > review here, I think? Once the general direction seems good, we can go
> > into the separate review threads for the implementation/mechanical
> > details.
>
> I think the latest patch series matches what came out of the
> discussion above, as you described it:
>
> > I would expect this to be handled with a general OP saying "hey,
I'm
> > skipping one level of indirection indirection in the resulting value,
> > because that indirection is missing/not in the final program" and
that this
> > would be encoded in a llvm.dbg.value/DIExpression as usual, without
the
> > need for new IR intrinsics, though possibly with the need for an LLVM
> > extension DWARF OP (DW_OP_LLVM_explicit_pointer?)
>
> That's what's been implemented, whenever an alloca is promoted,
> variable locations that used the allocas address are transformed into
> promoted-value variable locations in the usual way, but with a
> DW_OP_LLVM_explicit_pointer at the front of the expression to indicate
> "the pointer is absent, but this is what it would have pointed
at".
> Simple case:
>
>   i32 *%foo = alloca i32
>   dbg.declare(%foo, !123, !DIExpression())
>   dbg.value(%foo, !456, !DIExpression())
>   store i32 0, i32 *%foo
>
> Where !123 is a plain i32 source variable, and !456 a pointer-to-i32
> source variable. When %foo is promoted, these would become:
>
>   dbg.value(i32 0, !123, !DIExpression())
>   dbg.value(i32 0, !456, !DIExpression(DW_OP_LLVM_explicit_pointer))
>
> When it comes to the IR way of modelling these things, I think that
> this matches the discussion, and is a lightweight way of representing
> what's going on.
Awesome. I noticed there's also implicit_pointer as well as
explicit_pointer - what's the distinction there and are the two
concepts related enough to have some shared implementation
concerns/merit shared review?

Also, it looked like (at a glance) the LLVM_explicit (or implicit?)
pointer thing took an integer parameter, which I think is the amount
of indirection (so an int** collapsed down to an int would have a
value of '2' for this int parameter?) - minor discussion about whether
that's worthwhile, or whether we should stack LLVM_explicit_pointers
on top of each other to represent this case?
> I have some reservations about further down the compiler though:
> artificial variables get created at isel time, which seems early to
> me, and duplicates the work for each instruction selector.
Is this consistent with how we do dynamic array bounds (void f1(int i)
{ int x[i]; }), which I think is one of the things that we modeled
this idea from? Or perhaps it is done for array dimensions but isn't
suitable to this new use case as much?
> Is there a
> reason why it can't be done in the DWARF emitter? The artificial
> variables are also tracked with additional DBG_VALUE instructions, if
> we could push artificial variable creation back to emission time then
> we wouldn't have to answer questions such as "what is the lifetime
of
> a DBG_VALUE of an artificial variable?"
This/these sort of questions I'd like to punt to Adrian and other
folks who have had more investment in optimized debug info locations
in the last couple of years... I hope that isn't a cop-out, it's just
these particular aspects are probably not ones I have the most context
on.
> At promotion time: some of the handling of variable promotion appears
> to happen within Instruction::eraseFromParent, which seems out of
> place. I reckon you've missed the calls in PromoteMemoryToRegister.cpp
> to the ConvertDebugDeclareToDebugValue helpers -- shifting the
> promotion handling there would be better, and not dependent on the
> order that things are erased in. I think those ConvertDebug... helper
> functions and the two other functions you've instrumented in the same
> file should be sufficient to catch all promotions.
>
> Additionally, I believe that promoted allocas are getting
> DW_OP_LLVM_explicit_pointer dbg.values generated for any pointer that
> _ever_ points at it. You'll need to consider circumstances where
> pointer variables have multiple values, i.e.:
>
>   int foo, bar, baz;
>   int *qux = &foo;
>   qux = &bar;
>   qux = &baz;
>   foo = 1;
>   bar = 2;
>   baz = 3;
>
> If I understood the code correctly, 'qux' will have
implicit-pointer
> values for each of the assignments to foo / bar / baz, where it should
> only have a dbg.value for the assignment to 'baz'. (It might be
> alright to limit handling to scenarios where a pointer variable only
> ever has one value, and then expand what can be handled later).
Sounds like something to be aware of/see some test cases for, for sure.

Much thanks!
- Dave

llvm dev - Jul 2020 - DW_OP_implicit_pointer design/implementation in general

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

[llvm-dev] DW_OP_implicit_pointer design/implementation in general

[llvm-dev] DW_OP_implicit_pointer design/implementation in general