thr3ads.net - llvm dev - [llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands [Oct 2020]


Tozer, Stephen via llvm-dev

2020-Oct-06 12:13 UTC

[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands

> I can see how that could potentially be useful. I'm not sure how often
> we could practically make use of a situation like this, but I understand your
> motivation.
Indeed, I don't expect us to cancel out DWARF expressions like that very
often. Although that edge case is likely to be very rare, the _direct operator
itself will appear very frequently, as it would be used for every DBG_VALUE that
represents a register location. This allows us to represent register locations
in a way that doesn't rely on flags outside of the DIExpression, doesn't
require changes to be made to the flag/DIExpression if the register is RAUWd by
a constant or other value, and has a clear definition that doesn't clash
with anything in the DWARF spec. Supporting the no-op DIExpression reduction is
unlikely to have a huge impact in itself, but having a "stack_value that
could be an l-value" nicely rounds out the LLVM representation for debug
values.
>If we had DW_OP_LLVM_direct: what would be the semantics of
>
>DIExpression(DW_OP_constu, 4, DW_OP_minus, DW_OP_LLVM_direct)
>
>versus
>
>DIExpression(DW_OP_constu, 4, DW_OP_minus) ?
Once we have the _direct operator, which will be used for all register locations
and some implicit locations, we can safely say that any expression that
isn't _direct, implicit, or empty will be a memory location. So for the
first expression we would check to see if it could be emitted as a register
location, and when that fails we emit a stack value:

DW_OP_breg7 RSP+0, DW_OP_constu 4, DW_OP_minus, DW_OP_stack_value

Since the second expression is not LLVM_direct, stack_value, implicit_ptr, or
any other explicitly declared location type, then it must be a memory location,
so we emit:

DW_OP_breg7 RSP+0, DW_OP_constu 4, DW_OP_minus
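
The rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not LLVM code: DW_OP_LLVM_direct is the operator proposed in this thread, the function name `classify` and the list-of-tokens encoding are hypothetical, and treating an empty expression as a memory location is an assumption of the sketch.

```python
# Hypothetical sketch of the proposed classification rule; not LLVM's implementation.
DIRECT = "DW_OP_LLVM_direct"  # operator proposed in this thread, not part of DWARF
IMPLICIT_MARKERS = {"DW_OP_stack_value", "DW_OP_implicit_ptr"}


def classify(expr):
    """Return (kind, ops_to_emit) for a DIExpression given as a list of tokens."""
    if expr and expr[-1] == DIRECT:
        body = list(expr[:-1])
        if not body:
            # Nothing to compute: the value can live directly in the register.
            return ("register", [])
        # A register location cannot carry further operations, so fall back
        # to computing the value on the DWARF stack.
        return ("stack_value", body + ["DW_OP_stack_value"])
    if any(op in IMPLICIT_MARKERS for op in expr):
        # Already an explicitly declared implicit location; emit unchanged.
        return ("implicit", list(expr))
    # Not _direct and not implicit: the expression computes an address.
    return ("memory", list(expr))
```

With this encoding, the first expression in the question classifies as a stack value and gains a trailing DW_OP_stack_value, while the second is emitted unchanged as a memory location, matching the two DWARF expressions shown above (minus the DW_OP_breg7 prefix, which comes from the location operand).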

Adrian Prantl via llvm-dev

2020-Oct-06 20:42 UTC


> On Oct 6, 2020, at 5:13 AM, Tozer, Stephen <stephen.tozer at sony.com> wrote:
> 
> > I can see how that could potentially be useful. I'm not sure how
> > often we could practically make use of a situation like this, but I understand
> > your motivation.
> 
> Indeed, I don't expect us to cancel out DWARF expressions like that
> very often. Although that edge case is likely to be very rare, the _direct
> operator itself will appear very frequently, as it would be used for every
> DBG_VALUE that represents a register location. This allows us to represent
> register locations in a way that doesn't rely on flags outside of the
> DIExpression, doesn't require changes to be made to the flag/DIExpression if
> the register is RAUWd by a constant or other value, and has a clear definition
> that doesn't clash with anything in the DWARF spec. Supporting the no-op
> DIExpression reduction is unlikely to have a huge impact in itself, but having a
> "stack_value that could be an l-value" nicely rounds out the LLVM
> representation for debug values.
That makes sense.
> >If we had DW_OP_LLVM_direct: what would be the semantics of 
> >
> >DIExpression(DW_OP_constu, 4, DW_OP_minus, DW_OP_LLVM_direct)
> >
> >versus
> >
> >DIExpression(DW_OP_constu, 4, DW_OP_minus) ?
> 
> Once we have the _direct operator, which will be used for all register
> locations and some implicit locations, we can safely say that any expression
> that isn't _direct, implicit, or empty will be a memory location.
I don't see how this is a meaningful distinction in LLVM IR. In LLVM IR we
only have SSA values. An SSA value could be an alloca, or a gep into an alloca,
or spilled onto the stack at the MIR level, in which case the dbg.value should
get lowered into a memory location (if it isn't explicitly a
DW_OP_stack_value). Do you have an example of a dbg.value that isn't a
DW_OP_stack_value where it makes sense to distinguish between a memory and a
register location?

Perhaps another way to phrase this question — is there a difference between

dbg.value(my_alloca, var, !DIExpression(DW_OP_deref, DW_OP_LLVM_direct))

and

dbg.value(my_alloca, var, !DIExpression(DW_OP_deref)) ?

thanks,
Adrian
> So for the first expression we would check to see if it could be emitted as
> a register location, and when that fails we emit a stack value:
> 
> DW_OP_breg7 RSP+0, DW_OP_constu 4, DW_OP_minus, DW_OP_stack_value
> 
> Since the second expression is not LLVM_direct, stack_value, implicit_ptr,
> or any other explicitly declared location type, then it must be a memory
> location, so we emit:
> 
> DW_OP_breg7 RSP+0, DW_OP_constu 4, DW_OP_minus

Tozer, Stephen via llvm-dev

2020-Oct-07 12:38 UTC


> I don't see how this is a meaningful distinction in LLVM IR. In LLVM IR
> we only have SSA values. An SSA value could be an alloca, or a gep into an
> alloca, or spilled onto the stack at the MIR level, in which case the dbg.value
> should get lowered into a memory location (if it isn't explicitly a
> DW_OP_stack_value).
I think the distinction is still important; even at the IR level, if we have a
dbg.value that uses an alloca or something similar, we can still distinguish
between "this alloca is the variable's location" versus "this
alloca is the variable's value", i.e. the variable itself is a pointer
to a local variable. In IR, we implicitly distinguish between these by using
dbg.declare/dbg.addr for the former, and dbg.value for the latter. In MIR, we
use the indirectness flag instead. DW_OP_LLVM_direct can supplant the latter.

Apologies for the somewhat confusing explanation thus far; in the IR stage, I
think that actually we wouldn't need to produce DW_OP_LLVM_direct at all
(although there's no harm in doing so) as long as we have the existing set
of debug variable intrinsics, because "directness" is already made
explicit by the choice of intrinsic. Every dbg.value would implicitly be
LLVM_direct unless it has another implicit location specifier (such as
stack_value or implicit_ptr). This would mean that we could have a debug value:
dbg.value(%a, "a", (DW_OP_plus_uconst, 5)), with no stack_value
necessary, as opposed to the current case where every dbg.value with a complex
expression has stack_value (I believe).

As discussed, one of the key distinctions that DW_OP_LLVM_direct is used for is
distinguishing between memory and register locations; this is exactly the same
as the difference between dbg.addr(%a, "a", ()) and dbg.value(%a,
"a", ()). The former would become DBG_VALUE %a, "a", () and
the latter would become DBG_VALUE %a, "a", (DW_OP_LLVM_direct).
> Do you have an example of a dbg.value that isn't a DW_OP_stack_value
> where it makes sense to distinguish between a memory and a register location?
>
> Perhaps another way to phrase this question — is there a difference between
>
> dbg.value(my_alloca, var, !DIExpression(DW_OP_deref, DW_OP_LLVM_direct))
>
> and
>
> dbg.value(my_alloca, var, !DIExpression(DW_OP_deref)) ?
So with that in mind, we wouldn't need to produce these, as in both cases
the intent would be that the value of "var" is at the address given by
"my_alloca". When we produce the corresponding DBG_VALUEs for these,
both would end with DW_OP_LLVM_direct. This would change if we unified the IR
debug intrinsics so that a single intrinsic could represent both memory
locations and register/implicit locations, as opposed to the current state where
the former can only be represented by dbg.declare/dbg.addr and the latter can
only be represented by dbg.value.
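
The IR-to-MIR mapping described in this message can be sketched in Python as follows. This is an illustration under stated assumptions, not LLVM code: DW_OP_LLVM_direct is the operator proposed in this thread, and the function name `to_dbg_value_expr` and the list-of-tokens encoding are hypothetical.

```python
# Hypothetical sketch of how the choice of IR intrinsic could determine the
# DIExpression attached to the resulting DBG_VALUE; not LLVM's implementation.
DIRECT = "DW_OP_LLVM_direct"  # operator proposed in this thread, not part of DWARF
IMPLICIT_SPECIFIERS = {"DW_OP_stack_value", "DW_OP_implicit_ptr"}


def to_dbg_value_expr(intrinsic, expr):
    """Return the DBG_VALUE expression for an IR debug intrinsic."""
    expr = list(expr)
    if intrinsic in ("dbg.declare", "dbg.addr"):
        # The operand is the variable's address: keep a memory location.
        return expr
    if intrinsic != "dbg.value":
        raise ValueError(f"unknown debug intrinsic: {intrinsic}")
    if any(op in IMPLICIT_SPECIFIERS for op in expr):
        # Another implicit location specifier is already present.
        return expr
    if expr and expr[-1] == DIRECT:
        # Already explicitly direct; do not append a second marker.
        return expr
    # Every other dbg.value is implicitly direct, so make that explicit.
    return expr + [DIRECT]
```

Under these assumptions, `to_dbg_value_expr("dbg.addr", [])` stays empty while `to_dbg_value_expr("dbg.value", [])` gains DW_OP_LLVM_direct, and the two DW_OP_deref forms from the question produce the same result.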
