Jeremy Morse via llvm-dev

2020-Sep-09 15:19 UTC

[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR

Hi Djordje,

On Wed, Sep 9, 2020 at 7:52 AM Djordje Todorovic
<Djordje.Todorovic at syrmia.com> wrote:
> Using entry-values ('callee' side of the feature) is not enough in
> any case. It is always connected to the call-site-param (function
> arguments but we call it call-site-params; 'caller' side of the
> feature) debug info. I believe that there are call-site-params that
> could be expressed in terms of DWARF for the cases we face within
> deadargelim. GCC does perform correct output for both caller and
> callee sides for unused params.

Ah, that covers my concerns. This is definitely a worthy cause then --
especially as parameters are usually considered more important to
preserve than other variables.

> Please share that work when you are ready.

Sure, explanation below: note that I'm bringing this up now because I
see producing entry-value "backup" locations as a technique to recover
from the register allocator clobbering things, and I feel the below is
a more general solution.

I'd like to use this (contrived) code as an illustrative example:

    void ext(long);
    void foo(long *ptr, long bar, long baz) {
      for (long i = 0; i < bar; ++i) {
        long index = baz + i;
        long *curptr = &ptr[index];
        ext(*curptr);
      }
    }

All it does is iterate over a loop, loading values from an offset into
a pointer. I've compiled this at -O2, and then given it an additional
run of -loop-reduce with opt [0]. During optimisation, LLVM rightly
identifies that the 'baz' offset is loop-invariant, and that it can
fold some of the offset calculation into the loop preheader. This then
leads to both 'ptr' and 'baz' being out of liveness, and being
clobbered in the body of the loop. In addition, the 'index' variable
is optimised out too, and that's the variable I'd like to focus on.

Today, we're not able to describe 'index' in the IR after
-loop-reduce, but I'm confident that the variadic variable locations
work will make that possible. I'm going to assume that we can describe
such locations for the rest of this email.

'index' could be described by using the entry value of 'baz' and
adding it to 'i', which remains in liveness throughout. To produce a
"backup" location though, we would have to guess in advance that 'baz'
would go out of liveness, and speculatively produce the expression.
I reckon that we can instead calculate the location at end of
compilation by using the SSA-like information from instruction
referencing. Here's the MIR for the reduced loop body, using
instruction-referencing [1] and lightly edited to remove noise, with
only variable locations for the 'i' variable. I've added some
explanatory comments:

    DBG_PHI $rbx, 2
    DBG_INSTR_REF 2, 0, !16, !DIExpression(), debug-location !23
    ; This is the load from *curptr:
    renamable $rdi = MOV64rm renamable $r15, 8, renamable $rbx
    ; Call to ext:
    CALL64pcrel32 @ext, csr_64, [implicit defs]
    ; Loop increment:
    renamable $rbx = nuw nsw ADD64ri8 killed renamable $rbx, 1, debug-instr-number 1
    DBG_INSTR_REF 1, 0, !16, !DIExpression(), debug-location !23
    CMP64rr renamable $r14, renamable $rbx, implicit-def $eflags
    JCC_1 %bb.2, 5, implicit $eflags

The label "debug-instr-number 1" on the ADD64ri8 identifies the ADD as
corresponding to the loop increment, and the DBG_PHI for $rbx as the
position where the loop PHI occurs. My key observation is that there
is a one-to-one relationship between LLVM-IR Values and these
end-of-compilation instruction numbers [2]. If we stored a mapping
during instruction selection of Value <=> instruction reference, at
the end of compilation we would be able to salvage variable locations
that had gone out of liveness.
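
To make the Value <=> instruction reference mapping concrete, here is a minimal Python toy model. It is only a sketch: `IRValue` and `ValueRefMap` are hypothetical names, not LLVM classes, and the number 3 for 'baz' is an assumption made for illustration (numbers 1 and 2 come from the MIR above).

```python
# Toy model only: IRValue and ValueRefMap are hypothetical, not LLVM APIs.
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

InstrRef = Tuple[int, int]  # (debug-instr-number, operand index)


@dataclass(frozen=True)
class IRValue:
    """Stand-in for an LLVM-IR Value (an Argument or an Instruction)."""
    name: str
    kind: str  # "argument" or "instruction"


class ValueRefMap:
    """One-to-one map between IR Values and instruction references."""

    def __init__(self) -> None:
        self._by_value: Dict[IRValue, InstrRef] = {}
        self._by_ref: Dict[InstrRef, IRValue] = {}

    def record(self, value: IRValue, ref: InstrRef) -> None:
        # Refuse anything that would break the one-to-one property.
        if value in self._by_value or ref in self._by_ref:
            raise ValueError("mapping must stay one-to-one")
        self._by_value[value] = ref
        self._by_ref[ref] = value

    def value_for(self, ref: InstrRef) -> Optional[IRValue]:
        return self._by_ref.get(ref)

    def ref_for(self, value: IRValue) -> Optional[InstrRef]:
        return self._by_value.get(value)


# What instruction selection might record for the example loop.
refs = ValueRefMap()
baz = IRValue("baz", "argument")
i_phi = IRValue("i", "instruction")        # loop PHI, DBG_PHI number 2
i_next = IRValue("i.next", "instruction")  # the ADD, debug-instr-number 1
refs.record(baz, (3, 0))                   # number 3 is assumed
refs.record(i_phi, (2, 0))
refs.record(i_next, (1, 0))
```

At the end of compilation, `refs.value_for((3, 0))` would tell us that the reference names an Argument, which is the fact the salvaging idea relies on.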

Imagine for a moment that we described the "index" variable as a
variadic variable location, possibly looking like this:

    DBG_INSTR_REF {3, 0}, {2, 0}, !17,
        !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus)

Here, instruction number {3, 0} refers to the 'baz' argument, and
{2, 0} to the value of 'i' on entry to the loop body. The workflow
for salvaging would look something like this, after LiveDebugValues
has finished doing dataflow things:
  1) Examine instruction reference {3, 0},
  2) Observe that it's out of liveness in the current location (the
     loop body),
  3) Look up the LLVM-IR Value that {3, 0} corresponds to, finding the
     Argument in LLVM-IR,
  4) Because it's an Argument, replace DW_OP_LLVM_arg, 0 with the
     corresponding entry value expression,
  5) Emit variable location.
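
The five steps above can be sketched as a small Python toy model. This is not LLVM code: `salvage_arguments` is a hypothetical helper, the expression is modelled as a flat token list in which only `DW_OP_LLVM_arg` carries an operand, and the tuple `("DW_OP_entry_value", name)` is a symbolic placeholder for whatever entry-value encoding would really be emitted.

```python
# Toy model only: hypothetical names, not LLVM's DIExpression API.
from typing import Dict, List, Optional, Set, Tuple

InstrRef = Tuple[int, int]  # (debug-instr-number, operand index)


def salvage_arguments(
    operands: List[InstrRef],
    expr: list,
    live: Set[InstrRef],
    values: Dict[InstrRef, Tuple[str, str]],  # ref -> (kind, name)
) -> Optional[list]:
    """Rewrite 'DW_OP_LLVM_arg, N' when operand N is a dead Argument.

    Returns the rewritten expression, or None if a dead operand is
    not an Argument and so cannot be salvaged this way.
    """
    out: list = []
    pos = 0
    while pos < len(expr):
        op = expr[pos]
        if op != "DW_OP_LLVM_arg":
            out.append(op)              # operand-less op such as DW_OP_plus
            pos += 1
            continue
        index = expr[pos + 1]
        ref = operands[index]           # step 1: examine the reference
        if ref in live:                 # step 2: still live, keep as-is
            out += [op, index]
        else:
            kind, name = values.get(ref, ("unknown", ""))  # step 3
            if kind != "argument":
                return None
            out.append(("DW_OP_entry_value", name))        # step 4
        pos += 2
    return out                          # step 5: ready to emit


operands = [(3, 0), (2, 0)]
expr = ["DW_OP_LLVM_arg", 0, "DW_OP_LLVM_arg", 1, "DW_OP_plus"]
values = {(3, 0): ("argument", "baz"), (2, 0): ("instruction", "i")}

# In the loop body only 'i' ({2, 0}) is still live.
salvaged = salvage_arguments(operands, expr, {(2, 0)}, values)
```

With these inputs, `salvaged` keeps the reference to 'i' and swaps the dead 'baz' operand for the entry-value placeholder. If both operands are live, the expression comes back unchanged.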

This is harder than just speculating how we might salvage the location
earlier in compilation, but is more direct, and involves no
unnecessary work. Additionally, it's not limited to entry values: for
any value that goes out of liveness and was computed by a
side-effect-free instruction, we could instead continue from step 3 with:
  4) For each operand of the corresponding LLVM-IR Instruction,
  4.1) Identify the instruction number of this operand,
  4.2) Confirm that that number is still in liveness (if not: abort),
  5) Compute an expression that recomputes the Value using the
     locations of the operands,
  6) Emit variable location.
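
A rough Python sketch of this more general variant follows. It is a toy model under stated assumptions: `recompute` and the `defs` table are hypothetical, only a few binary integer opcodes are handled, the instruction numbers 4 and 5 are made up, and the result reuses the `DW_OP_LLVM_arg` notation from the variadic example.

```python
# Toy model only: hypothetical names and instruction numbers.
from typing import Dict, List, Optional, Set, Tuple

InstrRef = Tuple[int, int]  # (debug-instr-number, operand index)

# For each recorded IR instruction: its opcode and its operands' references.
Defs = Dict[InstrRef, Tuple[str, List[InstrRef]]]

# Side-effect-free binary opcodes with a direct DWARF stack operation.
DWARF_OP = {"add": "DW_OP_plus", "sub": "DW_OP_minus", "mul": "DW_OP_mul"}


def recompute(
    ref: InstrRef, defs: Defs, live: Set[InstrRef]
) -> Optional[Tuple[List[InstrRef], list]]:
    """Return (operands, expression) recomputing a dead value, or None."""
    if ref not in defs:
        return None
    opcode, operand_refs = defs[ref]
    if opcode not in DWARF_OP or len(operand_refs) != 2:
        return None                      # not expressible in this toy model
    for operand in operand_refs:         # steps 4 and 4.1
        if operand not in live:          # step 4.2: abort
            return None
    # Step 5: push both operands, then apply the operation.
    expr = ["DW_OP_LLVM_arg", 0, "DW_OP_LLVM_arg", 1, DWARF_OP[opcode]]
    return list(operand_refs), expr      # step 6: ready to emit


# Hypothetical: value {5, 0} was computed as add {2, 0}, {4, 0}.
defs: Defs = {(5, 0): ("add", [(2, 0), (4, 0)])}
recovered = recompute((5, 0), defs, live={(2, 0), (4, 0)})
aborted = recompute((5, 0), defs, live={(2, 0)})
```

Here `recovered` describes the dead value as the sum of its two live operands, while `aborted` is `None` because {4, 0} is itself out of liveness.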

We could even go the other way and recover a value from other
computations that used the value (if an inverse operation exists).
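
As an illustration of going the other way, the Python sketch below recovers a dead value x from a live user y = x op C, where C is a constant. It is a toy model: `recover_from_user` and `evaluate` are hypothetical helpers, only add and sub are treated as invertible, and the tiny evaluator exists only to check the round trip.

```python
# Toy model only: hypothetical helpers, not LLVM code.
from typing import List, Optional

# The DWARF operation that undoes each invertible IR opcode.
INVERSE = {"add": "DW_OP_minus", "sub": "DW_OP_plus"}


def recover_from_user(user_opcode: str, constant: int) -> Optional[list]:
    """Expression recovering x from a live user y = x <op> constant.

    'DW_OP_LLVM_arg, 0' stands for the location of y.
    """
    inverse = INVERSE.get(user_opcode)
    if inverse is None or constant < 0:  # DW_OP_constu is unsigned
        return None
    return ["DW_OP_LLVM_arg", 0, "DW_OP_constu", constant, inverse]


def evaluate(expr: list, args: List[int]) -> int:
    """Minimal stack evaluator for the handful of ops used above."""
    stack: List[int] = []
    pos = 0
    while pos < len(expr):
        op = expr[pos]
        if op == "DW_OP_LLVM_arg":
            stack.append(args[expr[pos + 1]])
            pos += 2
        elif op == "DW_OP_constu":
            stack.append(expr[pos + 1])
            pos += 2
        elif op in ("DW_OP_plus", "DW_OP_minus"):
            rhs = stack.pop()
            lhs = stack.pop()
            stack.append(lhs + rhs if op == "DW_OP_plus" else lhs - rhs)
            pos += 1
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1]


x = 10
from_add = recover_from_user("add", 4)  # the user computed y = x + 4
from_sub = recover_from_user("sub", 4)  # the user computed y = x - 4
```

Evaluating `from_add` with y = 14, or `from_sub` with y = 6, yields the original x. Multiplication is left out on purpose, since it has no exact inverse in general.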

~

This may sound far-fetched, but I think a lot of the information
necessary to do the above is becoming available. Doing this completely
in general would involve putting instruction references on every
LLVM-IR Value in the function being compiled, which could add an
overhead. Whether that's worth it depends on how many variable
locations could be recovered.

Again, it hinges on not finding something fatal in the instruction
referencing approach to variable locations. Tail duplication is being
miserable, but hasn't thrown up anything fatal yet.

Thanks for listening!

[0] I'm not sure why -loop-reduce wasn't firing during -O2, but that's
not important.
[1] It's possible with all the patches I've uploaded for review so
far; although I seem to have missed the patch to InstrEmitter.cpp that
labels PHI instructions, I'll try to get that up soon.
[2] Not necessarily true after tail-duplication runs; but I believe
that can be addressed with some minor pain.

--
Thanks,
Jeremy

Djordje Todorovic via llvm-dev

2020-Sep-10 11:04 UTC

[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR

Hi Jeremy,

Thanks for proposing that.
First of all, I think that all the dbg-instr-ref work can give us a lot of
benefits when implemented/committed (since handling of the llvm.dbg.value()
intrinsic is indeed easier), so thanks for that.
I really like the idea you have described, and I have a couple of
questions/comments:

  1.  The entry-values-as-"backups" could be used this way, but we
      first need to have DBG_INSTR_REF in use. I don't see an overlap
      with the way I suggested for the current "non-ref-dbg-values" at
      MIR. (?)
  2.  This will be an improvement of DBG_INSTR_REF, since it needs the
      variadic form of the instruction, and we don't have it at the
      moment? As you have pointed out, by having the variadic form of
      the instruction, we can salvage "non-entry" values as well, I guess.

Best regards,
Djordje

Jeremy Morse via llvm-dev

2020-Sep-10 11:38 UTC

[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR

Hi Djordje,

On Thu, Sep 10, 2020 at 12:04 PM Djordje Todorovic
<Djordje.Todorovic at syrmia.com> wrote:
> The entry-values-as-"backups" could be used this way, but we
> first need to have DBG_INSTR_REF in use. I don't see an overlap
> with the way I suggested for the current "non-ref-dbg-values" at
> MIR. (?)

Indeed, there's no overlap between these two ideas -- yours is
producing expressions early and consuming late, while mine is both
producing and consuming late. Hence I wanted to get the idea out for
discussion before either is really pursued.

> This will be an improvement of DBG_INSTR_REF, since it needs the
> variadic form of the instruction, and we don't have it at the
> moment? As you have pointed out, by having the variadic form of
> the instruction, we can salvage "non-entry" values as well, I guess.

Indeed, although I think it'll be a bit easier than Stephen's
DBG_VALUE_LIST implementation, as the locations won't need maintenance
through the rest of CodeGen -- we would only need to generate
DBG_INSTR_REFs with multiple operands, then consider them at the end
of compilation.

A note on timescales: I don't see any of the instruction-referencing
work as likely to be "on by default" any time soon. It'll need some
comprehensive testing on large binaries, plus GlobalISel and AArch64
support, which I haven't thought about so far. It's worth pointing out
that producing backup entry values is something that could be
implemented and work almost immediately, and deliver benefits in the
next release, whereas the late-salvaging way definitely has a long
horizon.

--
Thanks,
Jeremy
