Tozer, Stephen via llvm-dev
2020-Oct-06 12:13 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> I can see how that could potentially be useful. I'm not sure how often we could practically make use of a situation like this, but I understand your motivation.Indeed, I don't expect us to cancel out DWARF expressions like that very often. Although that edge case is likely to be very rare, the _direct operator itself will appear very frequently, as it would be used for every DBG_VALUE that represents a register location. This allows us to represent register locations in a way that doesn't rely on flags outside of the DIExpression, doesn't require changes to be made to the flag/DIExpression if the register is RAUWd by a constant or other value, and has a clear definition that doesn't clash with anything in the DWARF spec. Supporting the no-op DIExpression reduction is unlikely to have a huge impact in itself, but having a "stack_value that could be an l-value" nicely rounds out the LLVM representation for debug values.>If we had DW_OP_LLVM_direct: what would be the semantics of > >DIExpression(DW_OP_constu, 4, DW_OP_minus, DW_OP_LLVM_direct) > >versus > >DIExpression(DW_OP_constu, 4, DW_OP_minus) ?Once we have the _direct operator, which will be used for all register locations and some implicit locations, we can safely say that any expression that isn't _direct, implicit, or empty will be a memory location. So for the first expression we would check to see if it could be emitted as a register location, and when that fails we emit a stack value: DW_OP_breg7 RSP+0, DW_OP_constu 4, DW_OP_minus, DW_OP_stack_value Since the second expression is not LLVM_direct, stack_value, implicit_ptr, or any other explicitly declared location type, then it must be a memory location, so we emit: DW_OP_breg7 RSP+0, DW_OP_constu 4, DW_OP_minus -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201006/e4389654/attachment-0001.html>
Adrian Prantl via llvm-dev
2020-Oct-06 20:42 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> On Oct 6, 2020, at 5:13 AM, Tozer, Stephen <stephen.tozer at sony.com> wrote: > > > I can see how that could potentially be useful. I'm not sure how often we could practically make use of a situation like this, but I understand your motivation. > > Indeed, I don't expect us to cancel out DWARF expressions like that very often. Although that edge case is likely to be very rare, the _direct operator itself will appear very frequently, as it would be used for every DBG_VALUE that represents a register location. This allows us to represent register locations in a way that doesn't rely on flags outside of the DIExpression, doesn't require changes to be made to the flag/DIExpression if the register is RAUWd by a constant or other value, and has a clear definition that doesn't clash with anything in the DWARF spec. Supporting the no-op DIExpression reduction is unlikely to have a huge impact in itself, but having a "stack_value that could be an l-value" nicely rounds out the LLVM representation for debug values.That makes sense.> >If we had DW_OP_LLVM_direct: what would be the semantics of > > > >DIExpression(DW_OP_constu, 4, DW_OP_minus, DW_OP_LLVM_direct) > > > >versus > > > >DIExpression(DW_OP_constu, 4, DW_OP_minus) ? > > Once we have the _direct operator, which will be used for all register locations and some implicit locations, we can safely say that any expression that isn't _direct, implicit, or empty will be a memory location.I don't see how this is a meaningful distinction in LLVM IR. In LLVM IR we only have SSA values. An SSA value could be an alloca, or a gep into an alloca, or spilled onto the stack at the MIR level, in which case the dbg.value should get lowered into a memory location (if it isn't explicitly a DW_OP_stack_value). Do you have an example of a a dbg.value that isn't a DW_OP_stack_value where it makes sense to distinguish between a memory and a register location? Perhaps another way to phrase this question — is there a difference between dbg.value(my_alloca, var, !DIExpression(DW_OP_deref, DW_OP_LLVM_direct)) and dbg.value(my_alloca, var, !DIExpression(DW_OP_deref)) ? thanks, Adrian> So for the first expression we would check to see if it could be emitted as a register location, and when that fails we emit a stack value: > > DW_OP_breg7 RSP+0, DW_OP_constu 4, DW_OP_minus, DW_OP_stack_value > > Since the second expression is not LLVM_direct, stack_value, implicit_ptr, or any other explicitly declared location type, then it must be a memory location, so we emit: > > DW_OP_breg7 RSP+0, DW_OP_constu 4, DW_OP_minus-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201006/ec78c916/attachment.html>
Tozer, Stephen via llvm-dev
2020-Oct-07 12:38 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> I don't see how this is a meaningful distinction in LLVM IR. In LLVM IR we only have SSA values. An SSA value could be an alloca, or a gep into an alloca, or spilled onto the stack at the MIR level, in which case the dbg.value should get lowered into a memory location (if it isn't explicitly a DW_OP_stack_value).I think the distinction is still important; even at the IR level, if we have a dbg.value that uses an alloca or something similar, we can still distinguish between "this alloca is the variable's location" versus "this alloca is the variable's value", i.e. the variable itself is a pointer to a local variable. In IR, we implicitly distinguish between these by using dbg.declare/dbg.addr for the former, and dbg.value for the latter. In MIR, we use the indirectness flag instead. DW_OP_LLVM_direct can supplant the latter. Apologies for the somewhat confusing explanation thus far; in the IR stage, I think that actually we wouldn't need to produce DW_OP_LLVM_direct at all (although there's no harm in doing so) as long as we have the existing set of debug variable intrinsics, because "directness" is already made explicit by the choice of intrinsic. Every dbg.value would implicitly be LLVM_direct unless it has another implicit location specifier (such as stack_value or implicit_ptr). This would mean that we could have a debug value: dbg.value(%a, "a", (DW_OP_plus_uconst, 5)), with no stack_value necessary, as opposed to the current case where every dbg.value with a complex expression has stack_value (I believe). As discussed, one of the key distinctions that DW_OP_LLVM_direct is used for is distinguishing between memory and register locations; this is exactly the same as the difference between dbg.addr(%a, "a", ()) and dbg.value(%a, "a", ()). The former would become DBG_VALUE %a, "a", () and the latter would become DBG_VALUE %a, "a", (DW_OP_LLVM_direct).> Do you have an example of a a dbg.value that isn't a DW_OP_stack_value where it makes sense to distinguish between a memory and a register location? > > Perhaps another way to phrase this question — is there a difference between > > dbg.value(my_alloca, var, !DIExpression(DW_OP_deref, DW_OP_LLVM_direct)) > > and > > dbg.value(my_alloca, var, !DIExpression(DW_OP_deref)) ?So with that in mind, we wouldn't need to produce these, as in both cases the intent would be that the value of "var" is at the address given by "my_alloca". When we produce the corresponding DBG_VALUEs for these, both would end with DW_OP_LLVM_direct. This would change if we unified the IR debug intrinsics so that a single intrinsic could represent both memory locations and register/implicit locations, as opposed to the current state where the former can only be represented by dbg.declare/dbg.addr and the latter can only be represented by dbg.value. -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201007/11ce3656/attachment.html>