Tozer, Stephen via llvm-dev
2020-Feb-21  13:24 UTC
[llvm-dev] [RFC] Allowing debug intrinsics to reference multiple SSA Values
What would it look like without this extension? If we modeled it as if all the
register values were already on the stack (an extension of the current way where
the singular value is modeled as being already on the stack, if I understand it
correctly?)?
If it's decided that the best approach is to introduce something like
DW_OP_LLVM_register - might be worth migrating to that first (basically adding
DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it
from unary to n-ary support
Putting the register values initially on the stack reduces the verbosity, though
it could complicate successive salvages of variadic DIExpressions - if any value
other than the last needs salvaging, then you have to use DWARF stack operations
to move it to the top of the stack. For example, if the elements are pushed in
order so that the last element is on the top of the stack:
%c = mul 3, %a
%d = add 5, %b
dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus), %c, %d)
; Salvage %d
dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5,
DW_OP_plus), %c, %b)
; Salvage %c needs to use DW_OP_swap
dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5,
DW_OP_swap, DW_OP_constu, 3, DW_OP_mul, DW_OP_plus), %a, %b)
Or, written out as the stack state:
[%a, %b]               ; Initial state
[%a, %b + 5]           ; DW_OP_plus_constu, 5
[%b + 5, %a]           ; DW_OP_swap
[%b + 5, %a, 3]        ; DW_OP_constu, 3
[%b + 5, %a * 3]       ; DW_OP_mul
[(%b + 5) + (%a * 3)]  ; DW_OP_plus
The simplest and most general solution would be to use DW_OP_pick, which
duplicates the stack element at a given index to the top of the stack. This is
more or less the same as using DW_OP_LLVM_register - both of them take an index
corresponding to a register value - with a few differences: we get to maintain
the default concise dbg.value with a single argument and empty DIExpression, but
salvaging becomes more brittle. If we rely on elements being on the stack in
their declared order, then we need to guarantee that nothing modifies those
stack elements. The cleanest implementation of this would be for
SalvageDebugInfo to salvage expressions normally when the last register is
salvaged, and switch to using DW_OP_pick for every register whenever any other
register is salvaged:
%c = add %a, 5
%e = div %c, %d
; x = %b * ((%a + 5) / %d)
dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul), %b, %e)
; Salvage %e; last operator so salvage normally
dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul, DW_OP_div),
%b, %c, %d)
; Salvage %c; not last operator and expression isn't already using pick, so
add DW_OP_pick for registers
dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_pick, 0,
DW_OP_pick, 1, DW_OP_plus_constu, 5, DW_OP_pick, 2, DW_OP_div, DW_OP_mul), %b,
%a, %d)
The major advantage of putting all registers on the stack is that it reduces
verbosity and also doesn't require us to implement the new operator or
update existing DIExpressions to contain it, which is useful. Personally I lean
towards keeping things simple and consistent, and so using the pick/register
operator in every expression rather than only the ones that need it, but
there's a good case for either I think.
________________________________
From: David Blaikie <dblaikie at gmail.com>
Sent: 20 February 2020 20:21
To: Tozer, Stephen <stephen.tozer at sony.com>; Adrian Prantl <aprantl
at apple.com>; Jonas Devlieghere <jdevlieghere at apple.com>; Robinson,
Paul <paul.robinson at sony.com>; Eric Christopher <echristo at
gmail.com>
Cc: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] [RFC] Allowing debug intrinsics to reference multiple
SSA Values
(+usual debug info folks)
I'm mostly staying out of discussions around optimized debug info/variable
locations - other folks have more state on this than I do. But some casual
thoughts from the peanut gallery...
On Thu, Feb 20, 2020 at 8:54 AM Tozer, Stephen via llvm-dev <llvm-dev at
lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:
Currently, the debug intrinsic functions each have 3 arguments: an SSA value
representing either the address or Value of a local variable, a DILocalVariable,
and a complex expression. If the SSA value is an Instruction, and that
Instruction is at some point deleted, we attempt to salvage the SSA value by
recreating the instruction within the complex expression. If the instruction
cannot be replicated by a complex expression, then the variable location is
dropped causing a reduction in coverage. One of the key restrictions on this
process at the moment is that each intrinsic can only reference a single SSA
value; a numeric constant may be represented within the expression itself,
allowing for binary operators with a constant operand to be salvaged:
%c = add i32 %a, 4
llvm.dbg.value(metadata i32 %c, DILocalVariable("x"), DIExpression())
; Salvage...
llvm.dbg.value(metadata i32 %a, DILocalVariable("x"),
DIExpression(DW_OP_constu, 4, DW_OP_plus))
This proposal is to allow multiple SSA value references within a debug
intrinsic, allowing binary operators with non-constant operands to be salvaged.
This is a long-awaited feature, with an open bugzilla[0] and support from
members of the community[1][2]. To implement this change, each debug intrinsic
will contain a list of SSA values instead of just one, and a new operator will
be added: DW_OP_LLVM_register, which takes as its only argument the index of an
SSA value within the intrinsic’s list, and pushes that SSA value onto the
expression stack.
What would it look like without this extension? If we modeled it as if all the
register values were already on the stack (an extension of the current way where
the singular value is modeled as being already on the stack, if I understand it
correctly?)?
If it's decided that the best approach is to introduce something like
DW_OP_LLVM_register - might be worth migrating to that first (basically adding
DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it
from unary to n-ary support.
Two proposed syntaxes for the list of SSA values - though suitable alternatives
may be worth considering - are to either replace the first argument of the
intrinsic function with an MDNode containing the SSA values as operands, or to
remove the first argument and make the intrinsic function variadic, passing the
SSA value list as vargs:
%c = add i32 %a, %b
llvm.dbg.value(metadata i32 %c, DILocalVariable("x"), DIExpression())
; Salvage...
llvm.dbg.value(!{metadata i32 %a, metadata i32 %b},
DILocalVariable("x"), DIExpression(DW_OP_LLVM_register, 0,
DW_OP_LLVM_register, 1, DW_OP_plus))
; Alternatively, the intrinsic function could be made variadic...
llvm.dbg.value(DILocalVariable("x"), DIExpression(DW_OP_LLVM_register,
0, DW_OP_LLVM_register, 1, DW_OP_plus), metadata i32 %a, metadata i32 %b)
The new operator DW_OP_LLVM_register would exist in the IR and MIR, and further
down the pipeline be replaced by the appropriate operator for the target debug
output. For example, when producing DWARF this would be replaced by
DW_OP_regval_type, which pushes the contents of a given register interpreted as
a value of a given type onto the DWARF expression stack.
This has the potential to allow salvaging in a much greater number of cases than
is currently possible. There are also potential follow-up tasks, such as
allowing the salvaging of conditional values, that would further improve debug
variable availability. The following table gives, for several of the
multi-source application projects in the test suite, the number of successful
invocations of SalvageDebugInfo, and the number of failed salvages for each type
of unsalvageable instruction:
            Success             Variadic Binops     Variadic GEPs       Cmp
Insts           Select Insts        Load Insts          Phi Nodes          
Alloca Insts        Call Insts
ALAC        261                 29                  61                  0       
0                   1                   12                  0                  
0
Burg        50                  1                   9                   0       
1                   95                  6                   0                  
0
hbd         514                 16                  1                   0       
3                   45                  10                  0                  
4
Lua         270                 12                  54                  0       
12                  46                  32                  1                  
0
minisat     458                 10                  10                  3       
1                   35                  4                   0                  
0
sgefa       439                 1                   121                 0       
20                  14                  55                  0                  
0
SIBsim4     153                 15                  6                   0       
3                   40                  3                   0                  
0
siod        112                 2                   1                   0       
0                   11                  5                   2                  
1
SPASS       1241                70                  27                  27      
27                  2114                156                 0                  
7
spiff       39                  0                   15                  0       
0                   7                   2                   1                  
0
sqlite3     2322                94                  167                 6       
37                  1143                136                 4                  
10
treecc      127                 1                   0                   0       
1                   350                 37                  0                  
1
viterbi     7                   0                   1                   0       
1                   1                   0                   0                  
0
Of these categories, the first 3 will become salvageable after this work is
implemented, and Select Insts will also be salvageable with some follow-up work
to enable conditional branching in complex expressions. The remainder are not
salvageable in general, although it's possible that the specific passes that
delete those instructions may be able to preserve the debug info as long as the
code isn't totally dead.
[0] https://bugs.llvm.org/show_bug.cgi?id=39141
[1] https://reviews.llvm.org/D51976#1237060
[2] http://lists.llvm.org/pipermail/llvm-dev/2019-November/137021.html
_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200221/3e8b1978/attachment.html>
Djordje Todorovic via llvm-dev
2020-Feb-24  09:14 UTC
[llvm-dev] [RFC] Allowing debug intrinsics to reference multiple SSA Values
Hi Stephen, This makes sense to me. The Salvage Debug Info functionality is missing the chance to salvage a lot of cases where the optimized-out instructions manipulate on multiple SSA values. Currently, it supports only expressions with a single SSA value and a constant. This could have good impact on debug optimized code and debug location coverage. DW_OP_pick should be something like DW_OP_LLVM_pick, but I think it is an implementation detail, so it might be too early for such comment. Best, Djordje On 21.2.20. 14:24, Tozer, Stephen via llvm-dev wrote:> What would it look like without this extension? If we modeled it as if all the register values were already on the stack (an extension of the current way where the singular value is modeled as being already on the stack, if I understand it correctly?)? > > If it's decided that the best approach is to introduce something like DW_OP_LLVM_register - might be worth migrating to that first (basically adding DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it from unary to n-ary support > > Putting the register values initially on the stack reduces the verbosity, though it could complicate successive salvages of variadic DIExpressions - if any value other than the last needs salvaging, then you have to use DWARF stack operations to move it to the top of the stack. For example, if the elements are pushed in order so that the last element is on the top of the stack: > > %c = mul 3, %a > %d = add 5, %b > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus), %c, %d) > ; Salvage %d > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5, DW_OP_plus), %c, %b) > ; Salvage %c needs to use DW_OP_swap > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5, DW_OP_swap, DW_OP_constu, 3, DW_OP_mul, DW_OP_plus), %a, %b) > > Or, written out as the stack state: > [%a, %b] ; Initial state > [%a, %b + 5] ; DW_OP_plus_constu, 5 > [%b + 5, %a] ; DW_OP_swap > [%b + 5, %a, 3] ; DW_OP_constu, 3 > [%b + 5, %a * 3] ; DW_OP_mul > [(%b + 5) + (%a * 3)] ; DW_OP_plus > > The simplest and most general solution would be to use DW_OP_pick, which duplicates the stack element at a given index to the top of the stack. This is more or less the same as using DW_OP_LLVM_register - both of them take an index corresponding to a register value - with a few differences: we get to maintain the default concise dbg.value with a single argument and empty DIExpression, but salvaging becomes more brittle. If we rely on elements being on the stack in their declared order, then we need to guarantee that nothing modifies those stack elements. The cleanest implementation of this would be for SalvageDebugInfo to salvage expressions normally when the last register is salvaged, and switch to using DW_OP_pick for every register whenever any other register is salvaged: > > %c = add %a, 5 > %e = div %c, %d > ; x = %b * ((%a + 5) / %d) > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul), %b, %e) > ; Salvage %e; last operator so salvage normally > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul, DW_OP_div), %b, %c, %d) > ; Salvage %c; not last operator and expression isn't already using pick, so add DW_OP_pick for registers > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_pick, 0, DW_OP_pick, 1, DW_OP_plus_constu, 5, DW_OP_pick, 2, DW_OP_div, DW_OP_mul), %b, %a, %d) > > The major advantage of putting all registers on the stack is that it reduces verbosity and also doesn't require us to implement the new operator or update existing DIExpressions to contain it, which is useful. Personally I lean towards keeping things simple and consistent, and so using the pick/register operator in every expression rather than only the ones that need it, but there's a good case for either I think. > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > *From:* David Blaikie <dblaikie at gmail.com> > *Sent:* 20 February 2020 20:21 > *To:* Tozer, Stephen <stephen.tozer at sony.com>; Adrian Prantl <aprantl at apple.com>; Jonas Devlieghere <jdevlieghere at apple.com>; Robinson, Paul <paul.robinson at sony.com>; Eric Christopher <echristo at gmail.com> > *Cc:* llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org> > *Subject:* Re: [llvm-dev] [RFC] Allowing debug intrinsics to reference multiple SSA Values > > (+usual debug info folks) > > I'm mostly staying out of discussions around optimized debug info/variable locations - other folks have more state on this than I do. But some casual thoughts from the peanut gallery... > > On Thu, Feb 20, 2020 at 8:54 AM Tozer, Stephen via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: > > Currently, the debug intrinsic functions each have 3 arguments: an SSA value representing either the address or Value of a local variable, a DILocalVariable, and a complex expression. If the SSA value is an Instruction, and that Instruction is at some point deleted, we attempt to salvage the SSA value by recreating the instruction within the complex expression. If the instruction cannot be replicated by a complex expression, then the variable location is dropped causing a reduction in coverage. One of the key restrictions on this process at the moment is that each intrinsic can only reference a single SSA value; a numeric constant may be represented within the expression itself, allowing for binary operators with a constant operand to be salvaged:____ > > __ __ > > %c = add i32 %a, 4____ > > llvm.dbg.value(metadata i32 %c, DILocalVariable("x"), DIExpression())____ > > ; Salvage...____ > > llvm.dbg.value(metadata i32 %a, DILocalVariable("x"), DIExpression(DW_OP_constu, 4, DW_OP_plus))____ > > __ __ > > This proposal is to allow multiple SSA value references within a debug intrinsic, allowing binary operators with non-constant operands to be salvaged. This is a long-awaited feature, with an open bugzilla[0] and support from members of the community[1][2]. To implement this change, each debug intrinsic will contain a list of SSA values instead of just one, and a new operator will be added: DW_OP_LLVM_register, which takes as its only argument the index of an SSA value within the intrinsic’s list, and pushes that SSA value onto the expression stack. > > > What would it look like without this extension? If we modeled it as if all the register values were already on the stack (an extension of the current way where the singular value is modeled as being already on the stack, if I understand it correctly?)? > > If it's decided that the best approach is to introduce something like DW_OP_LLVM_register - might be worth migrating to that first (basically adding DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it from unary to n-ary support. > > > Two proposed syntaxes for the list of SSA values - though suitable alternatives may be worth considering - are to either replace the first argument of the intrinsic function with an MDNode containing the SSA values as operands, or to remove the first argument and make the intrinsic function variadic, passing the SSA value list as vargs:____ > > __ __ > > %c = add i32 %a, %b____ > > llvm.dbg.value(metadata i32 %c, DILocalVariable("x"), DIExpression())____ > > ; Salvage...____ > > llvm.dbg.value(!{metadata i32 %a, metadata i32 %b}, DILocalVariable("x"), DIExpression(DW_OP_LLVM_register, 0, DW_OP_LLVM_register, 1, DW_OP_plus))____ > > ; Alternatively, the intrinsic function could be made variadic...____ > > llvm.dbg.value(DILocalVariable("x"), DIExpression(DW_OP_LLVM_register, 0, DW_OP_LLVM_register, 1, DW_OP_plus), metadata i32 %a, metadata i32 %b)____ > > __ __ > > The new operator DW_OP_LLVM_register would exist in the IR and MIR, and further down the pipeline be replaced by the appropriate operator for the target debug output. For example, when producing DWARF this would be replaced by DW_OP_regval_type, which pushes the contents of a given register interpreted as a value of a given type onto the DWARF expression stack.____ > > __ __ > > This has the potential to allow salvaging in a much greater number of cases than is currently possible. There are also potential follow-up tasks, such as allowing the salvaging of conditional values, that would further improve debug variable availability. The following table gives, for several of the multi-source application projects in the test suite, the number of successful invocations of SalvageDebugInfo, and the number of failed salvages for each type of unsalvageable instruction:____ > > __ __ > > Success Variadic Binops Variadic GEPs Cmp Insts Select Insts Load Insts Phi Nodes Alloca Insts Call Insts____ > > ALAC 261 29 61 0 0 1 12 0 0____ > > Burg 50 1 9 0 1 95 6 0 0____ > > hbd 514 16 1 0 3 45 10 0 4____ > > Lua 270 12 54 0 12 46 32 1 0____ > > minisat 458 10 10 3 1 35 4 0 0____ > > sgefa 439 1 121 0 20 14 55 0 0____ > > SIBsim4 153 15 6 0 3 40 3 0 0____ > > siod 112 2 1 0 0 11 5 2 1____ > > SPASS 1241 70 27 27 27 2114 156 0 7____ > > spiff 39 0 15 0 0 7 2 1 0____ > > sqlite3 2322 94 167 6 37 1143 136 4 10____ > > treecc 127 1 0 0 1 350 37 0 1____ > > viterbi 7 0 1 0 1 1 0 0 0____ > > __ __ > > __ __ > > Of these categories, the first 3 will become salvageable after this work is implemented, and Select Insts will also be salvageable with some follow-up work to enable conditional branching in complex expressions. The remainder are not salvageable in general, although it's possible that the specific passes that delete those instructions may be able to preserve the debug info as long as the code isn't totally dead.____ > > __ __ > > [0] https://bugs.llvm.org/show_bug.cgi?id=39141____ > > [1] https://reviews.llvm.org/D51976#1237060____ > > [2] http://lists.llvm.org/pipermail/llvm-dev/2019-November/137021.html____ > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Adrian Prantl via llvm-dev
2020-Feb-24  17:45 UTC
[llvm-dev] [RFC] Allowing debug intrinsics to reference multiple SSA Values
> On Feb 21, 2020, at 5:24 AM, Tozer, Stephen <stephen.tozer at sony.com> wrote: > > What would it look like without this extension? If we modeled it as if all the register values were already on the stack (an extension of the current way where the singular value is modeled as being already on the stack, if I understand it correctly?)? > > If it's decided that the best approach is to introduce something like DW_OP_LLVM_register - might be worth migrating to that first (basically adding DW_OP_LLVM_register(0) at the start of every DIExpression) and then expanding it from unary to n-ary support > Putting the register values initially on the stack reduces the verbosity, though it could complicate successive salvages of variadic DIExpressions - if any value other than the last needs salvaging, then you have to use DWARF stack operations to move it to the top of the stack. For example, if the elements are pushed in order so that the last element is on the top of the stack: > > %c = mul 3, %a > %d = add 5, %b > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus), %c, %d) > ; Salvage %d > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5, DW_OP_plus), %c, %b) > ; Salvage %c needs to use DW_OP_swap > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5, DW_OP_swap, DW_OP_constu, 3, DW_OP_mul, DW_OP_plus), %a, %b) > > Or, written out as the stack state: > [%a, %b] ; Initial state > [%a, %b + 5] ; DW_OP_plus_constu, 5 > [%b + 5, %a] ; DW_OP_swap > [%b + 5, %a, 3] ; DW_OP_constu, 3 > [%b + 5, %a * 3] ; DW_OP_mul > [(%b + 5) + (%a * 3)] ; DW_OP_plus > > The simplest and most general solution would be to use DW_OP_pick, which duplicates the stack element at a given index to the top of the stack. This is more or less the same as using DW_OP_LLVM_register - both of them take an index corresponding to a register value - with a few differences: we get to maintain the default concise dbg.value with a single argument and empty DIExpression, but salvaging becomes more brittle. If we rely on elements being on the stack in their declared order, then we need to guarantee that nothing modifies those stack elements. The cleanest implementation of this would be for SalvageDebugInfo to salvage expressions normally when the last register is salvaged, and switch to using DW_OP_pick for every register whenever any other register is salvaged: > > %c = add %a, 5 > %e = div %c, %d > ; x = %b * ((%a + 5) / %d) > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul), %b, %e) > ; Salvage %e; last operator so salvage normally > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul, DW_OP_div), %b, %c, %d) > ; Salvage %c; not last operator and expression isn't already using pick, so add DW_OP_pick for registers > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_pick, 0, DW_OP_pick, 1, DW_OP_plus_constu, 5, DW_OP_pick, 2, DW_OP_div, DW_OP_mul), %b, %a, %d) > > The major advantage of putting all registers on the stack is that it reduces verbosity and also doesn't require us to implement the new operator or update existing DIExpressions to contain it, which is useful. Personally I lean towards keeping things simple and consistent, and so using the pick/register operator in every expression rather than only the ones that need it, but there's a good case for either I think.As the person who has advocated for DW_OP_LLVM_arg(N) before, my main motivation was to resolve the ambiguity of constant DIExpressions: As a worst-case example: dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42)) Is this undefined, or constant 42? But if we make dbg.value fully variadic with all parameters pushed to the stack ahead of time, we can distinguish between dbg.value(!DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; a constant dbg.value(i32 42, !DILocalVariable(x), DIExpression()) ; the same constant dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; undef + garbage leftover expression If we do it this way and allow 0-n SSA values then I'm in support of the push-first scheme. As a side note, updating dbg.values in salvageDebugInfo when we allow multiple SSA values per dbg.value will be interesting, but possible. We'll need to put the value we need to update at the top of the stack, prepend the salvage expression and then restore the order of arguments on the stack that the rest of the expression expects. -- adrian
David Blaikie via llvm-dev
2020-Feb-24  18:35 UTC
[llvm-dev] [RFC] Allowing debug intrinsics to reference multiple SSA Values
This ambiguity could also be resolved without the "push everything" approach, though, right? If the format supported zero operands - then the expression would be evaluated with an empty stack (rather than the implicit first element stack - since there's no element). It'd make the "treat an operand list of length 1 specially, by pushing it onto the stack before evaluating the DWARF expression" look even more special/less tidy, but not totally outside the realm of possibility. (I don't have super strong feelings here - I like consistency, so I like the idea of pushing all the operands or using an extension op to access them all - but I can see the value in special casing the 1 operand case to maintain the behavior/syntax it has today) On Mon, Feb 24, 2020 at 9:45 AM Adrian Prantl <aprantl at apple.com> wrote:> > > > On Feb 21, 2020, at 5:24 AM, Tozer, Stephen <stephen.tozer at sony.com> > wrote: > > > > What would it look like without this extension? If we modeled it as if > all the register values were already on the stack (an extension of the > current way where the singular value is modeled as being already on the > stack, if I understand it correctly?)? > > > > If it's decided that the best approach is to introduce something like > DW_OP_LLVM_register - might be worth migrating to that first (basically > adding DW_OP_LLVM_register(0) at the start of every DIExpression) and then > expanding it from unary to n-ary support > > Putting the register values initially on the stack reduces the > verbosity, though it could complicate successive salvages of variadic > DIExpressions - if any value other than the last needs salvaging, then you > have to use DWARF stack operations to move it to the top of the stack. For > example, if the elements are pushed in order so that the last element is on > the top of the stack: > > > > %c = mul 3, %a > > %d = add 5, %b > > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus), %c, %d) > > ; Salvage %d > > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5, > DW_OP_plus), %c, %b) > > ; Salvage %c needs to use DW_OP_swap > > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_plus_constu, 5, > DW_OP_swap, DW_OP_constu, 3, DW_OP_mul, DW_OP_plus), %a, %b) > > > > Or, written out as the stack state: > > [%a, %b] ; Initial state > > [%a, %b + 5] ; DW_OP_plus_constu, 5 > > [%b + 5, %a] ; DW_OP_swap > > [%b + 5, %a, 3] ; DW_OP_constu, 3 > > [%b + 5, %a * 3] ; DW_OP_mul > > [(%b + 5) + (%a * 3)] ; DW_OP_plus > > > > The simplest and most general solution would be to use DW_OP_pick, which > duplicates the stack element at a given index to the top of the stack. This > is more or less the same as using DW_OP_LLVM_register - both of them take > an index corresponding to a register value - with a few differences: we get > to maintain the default concise dbg.value with a single argument and empty > DIExpression, but salvaging becomes more brittle. If we rely on elements > being on the stack in their declared order, then we need to guarantee that > nothing modifies those stack elements. The cleanest implementation of this > would be for SalvageDebugInfo to salvage expressions normally when the last > register is salvaged, and switch to using DW_OP_pick for every register > whenever any other register is salvaged: > > > > %c = add %a, 5 > > %e = div %c, %d > > ; x = %b * ((%a + 5) / %d) > > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul), %b, %e) > > ; Salvage %e; last operator so salvage normally > > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_mul, DW_OP_div), > %b, %c, %d) > > ; Salvage %c; not last operator and expression isn't already using pick, > so add DW_OP_pick for registers > > dbg.value(!DILocalVariable("x"), !DIExpression(DW_OP_pick, 0, > DW_OP_pick, 1, DW_OP_plus_constu, 5, DW_OP_pick, 2, DW_OP_div, DW_OP_mul), > %b, %a, %d) > > > > The major advantage of putting all registers on the stack is that it > reduces verbosity and also doesn't require us to implement the new operator > or update existing DIExpressions to contain it, which is useful. Personally > I lean towards keeping things simple and consistent, and so using the > pick/register operator in every expression rather than only the ones that > need it, but there's a good case for either I think. > > > As the person who has advocated for DW_OP_LLVM_arg(N) before, my main > motivation was to resolve the ambiguity of constant DIExpressions: As a > worst-case example: > > dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42)) > > Is this undefined, or constant 42? > > But if we make dbg.value fully variadic with all parameters pushed to the > stack ahead of time, we can distinguish between > > dbg.value(!DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; a constant > dbg.value(i32 42, !DILocalVariable(x), DIExpression()) ; the same constant > dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; > undef + garbage leftover expression > > If we do it this way and allow 0-n SSA values then I'm in support of the > push-first scheme. > > As a side note, updating dbg.values in salvageDebugInfo when we allow > multiple SSA values per dbg.value will be interesting, but possible. We'll > need to put the value we need to update at the top of the stack, prepend > the salvage expression and then restore the order of arguments on the stack > that the rest of the expression expects. > > -- adrian-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200224/ee189630/attachment.html>
Tozer, Stephen via llvm-dev
2020-Feb-25  16:39 UTC
[llvm-dev] [RFC] Allowing debug intrinsics to reference multiple SSA Values
>As the person who has advocated for DW_OP_LLVM_arg(N) before, my main motivation was to resolve the ambiguity of constant DIExpressions: As a worst-case example: > >dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42)) > >Is this undefined, or constant 42? > >But if we make dbg.value fully variadic with all parameters pushed to the stack ahead of time, we can distinguish between > >dbg.value(!DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; a constant >dbg.value(i32 42, !DILocalVariable(x), DIExpression()) ; the same constant dbg.value(%undef, !DILocalVariable(x), DIExpression(DW_OP_constu, 42)) ; undef + garbage leftover expression > >If we do it this way and allow 0-n SSA values then I'm in support of the push-first scheme.This seems like as good a case as any to go with the push-first approach, since it solves a meaningful problem that the alternative does not.>As a side note, updating dbg.values in salvageDebugInfo when we allow multiple SSA values per dbg.value will be interesting, but possible. We'll need to put the value we need to update at the top of the stack, prepend the salvage expression and then restore the order of arguments on the stack that the rest of the expression expects.The difficulty here is that in DWARF5, there is no operator that can move an argument up the stack; there are specific operators that rotate the top 2 or 3 elements, but outside of that the only thing you can do is copy an element to the top with DW_OP_pick. Of course we don't have to stick to actual DWARF operators by any means, but I don't think there's any way to eventually express an indexed swap/move operator in DWARF either: you can duplicate an element from anywhere on the stack to the top, but you can't delete the original element or move the top element back down again. Because of this, it is necessary (I believe) to use DW_OP_pick in the general case even if all the arguments are initially pushed onto the stack, due to the limits on how we can manipulate the stack. Of course there are many expressions that don't require DW_OP_pick to evaluate, but for consistency's sake I think it's best to use it instead of the swap or rotate operators whenever any of them is needed. Regardless of whether we push everything immediately or not, having an operator that pushes the SSA value at a given argument index onto the stack seems necessary - if we push elements initially then DW_OP_pick (or DW_OP_LLVM_pick), otherwise DW_OP_LLVM_register. In either of these cases, salvaging is simple: we simply append the salvage expression after the salvaged value's operator. It also translates cleanly to DWARF, since we can directly replace the operator in question with DW_OP_regval_type, using whichever register holds the value we're looking for as an argument, without needing to modify or even consider the rest of the expression at all.
Apparently Analagous Threads
- [RFC] Allowing debug intrinsics to reference multiple SSA Values
- [RFC] Allowing debug intrinsics to reference multiple SSA Values
- Defining the DIExpression operator DW_OP_LLVM_arg
- Defining the DIExpression operator DW_OP_LLVM_arg
- Defining the DIExpression operator DW_OP_LLVM_arg