Tozer, Stephen via llvm-dev
2020-Oct-07 12:38 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> I don't see how this is a meaningful distinction in LLVM IR. In LLVM IR we only have SSA values. An SSA value could be an alloca, or a gep into an alloca, or spilled onto the stack at the MIR level, in which case the dbg.value should get lowered into a memory location (if it isn't explicitly a DW_OP_stack_value).

I think the distinction is still important; even at the IR level, if we have a dbg.value that uses an alloca or something similar, we can still distinguish between "this alloca is the variable's location" versus "this alloca is the variable's value", i.e. the variable itself is a pointer to a local variable. In IR, we implicitly distinguish between these by using dbg.declare/dbg.addr for the former, and dbg.value for the latter. In MIR, we use the indirectness flag instead. DW_OP_LLVM_direct can supplant the latter.

Apologies for the somewhat confusing explanation thus far; in the IR stage, I think that actually we wouldn't need to produce DW_OP_LLVM_direct at all (although there's no harm in doing so) as long as we have the existing set of debug variable intrinsics, because "directness" is already made explicit by the choice of intrinsic.

Every dbg.value would implicitly be LLVM_direct unless it has another implicit location specifier (such as stack_value or implicit_ptr). This would mean that we could have a debug value:

  dbg.value(%a, "a", (DW_OP_plus_uconst, 5))

with no stack_value necessary, as opposed to the current case where every dbg.value with a complex expression has stack_value (I believe).

As discussed, one of the key distinctions that DW_OP_LLVM_direct is used for is distinguishing between memory and register locations; this is exactly the same as the difference between dbg.addr(%a, "a", ()) and dbg.value(%a, "a", ()). The former would become DBG_VALUE %a, "a", () and the latter would become DBG_VALUE %a, "a", (DW_OP_LLVM_direct).

> Do you have an example of a dbg.value that isn't a DW_OP_stack_value where it makes sense to distinguish between a memory and a register location?
>
> Perhaps another way to phrase this question: is there a difference between
>
>   dbg.value(my_alloca, var, !DIExpression(DW_OP_deref, DW_OP_LLVM_direct))
>
> and
>
>   dbg.value(my_alloca, var, !DIExpression(DW_OP_deref)) ?

So with that in mind, we wouldn't need to produce these, as in both cases the intent would be that the value of "var" is at the address given by "my_alloca". When we produce the corresponding DBG_VALUEs for these, both would end with DW_OP_LLVM_direct. This would change if we unified the IR debug intrinsics so that a single intrinsic could represent both memory locations and register/implicit locations, as opposed to the current state where the former can only be represented by dbg.declare/dbg.addr and the latter can only be represented by dbg.value.
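
The lowering rule described in this message can be sketched as a small Python model. This is only an illustration of the proposal as worded above, not LLVM code: DW_OP_LLVM_direct is the operator proposed in this thread, the class and function names (IRDebugIntrinsic, MIRDebugValue, lower) are hypothetical, and "implicit_ptr" is assumed to mean DWARF's DW_OP_implicit_pointer.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Operator proposed in this thread; assumed here, not an existing LLVM opcode.
DW_OP_LLVM_DIRECT = "DW_OP_LLVM_direct"

# Operators that already mark a location as implicit. The thread writes
# "implicit_ptr"; mapping that to DW_OP_implicit_pointer is an assumption.
IMPLICIT_OPS = {"DW_OP_stack_value", "DW_OP_implicit_pointer"}

ExprElem = Union[str, int]


@dataclass(frozen=True)
class IRDebugIntrinsic:
    """Hypothetical stand-in for dbg.value / dbg.addr / dbg.declare."""
    kind: str
    operand: str
    variable: str
    expr: Tuple[ExprElem, ...] = ()


@dataclass(frozen=True)
class MIRDebugValue:
    """Hypothetical stand-in for a DBG_VALUE instruction."""
    operand: str
    variable: str
    expr: Tuple[ExprElem, ...]


def lower(intrinsic: IRDebugIntrinsic) -> MIRDebugValue:
    """Lower an IR debug intrinsic to a DBG_VALUE, making directness explicit."""
    expr = tuple(intrinsic.expr)
    if intrinsic.kind == "dbg.value":
        # A dbg.value is implicitly direct unless another implicit
        # location specifier is already present in the expression.
        already_implicit = any(op in IMPLICIT_OPS for op in expr)
        if not already_implicit and DW_OP_LLVM_DIRECT not in expr:
            expr = expr + (DW_OP_LLVM_DIRECT,)
    elif intrinsic.kind not in ("dbg.addr", "dbg.declare"):
        raise ValueError(f"unknown debug intrinsic: {intrinsic.kind}")
    # dbg.addr / dbg.declare describe memory locations: expression unchanged.
    return MIRDebugValue(intrinsic.operand, intrinsic.variable, expr)
```

Under this model, the two my_alloca expressions from the quoted question lower to the same DBG_VALUE, which is why emitting DW_OP_LLVM_direct in IR would be redundant while the separate intrinsics exist.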
Adrian Prantl via llvm-dev
2020-Oct-07 17:40 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> On Oct 7, 2020, at 5:38 AM, Tozer, Stephen <stephen.tozer at sony.com> wrote:
>
> > I don't see how this is a meaningful distinction in LLVM IR. In LLVM IR we only have SSA values. An SSA value could be an alloca, or a gep into an alloca, or spilled onto the stack at the MIR level, in which case the dbg.value should get lowered into a memory location (if it isn't explicitly a DW_OP_stack_value).
>
> I think the distinction is still important; even at the IR level, if we have a dbg.value that uses an alloca or something similar, we can still distinguish between "this alloca is the variable's location" versus "this alloca is the variable's value", i.e. the variable itself is a pointer to a local variable. In IR, we implicitly distinguish between these by using dbg.declare/dbg.addr for the former, and dbg.value for the latter. In MIR, we use the indirectness flag instead. DW_OP_LLVM_direct can supplant the latter.
>
> Apologies for the somewhat confusing explanation thus far; in the IR stage, I think that actually we wouldn't need to produce DW_OP_LLVM_direct at all (although there's no harm in doing so) as long as we have the existing set of debug variable intrinsics, because "directness" is already made explicit by the choice of intrinsic.

That is exactly what I was trying to figure out. My concern is that it would be confusing to add a DWARF expression extension at the LLVM IR level that has no semantic effect. If we don't add a Verifier check that it is consistently applied, then inevitably various frontends will start producing either variant of DIExpression. If we do add a Verifier check, then we are effectively adding a pointless extra field to every DIExpression that has no effect. In either case it would be hard to explain to new LLVM developers why this operation exists in LLVM IR.

So I wonder if we should instead model this only at the MIR level, where this distinction actually makes sense. In MIR, we probably don't want to rewrite every DIExpression, so it would make sense to model it either as a flag on the intrinsic, or by having two kinds of intrinsics.

What do you think about making this a property of MIR instead?

-- adrian

> Every dbg.value would implicitly be LLVM_direct unless it has another implicit location specifier (such as stack_value or implicit_ptr). This would mean that we could have a debug value: dbg.value(%a, "a", (DW_OP_plus_uconst, 5)), with no stack_value necessary, as opposed to the current case where every dbg.value with a complex expression has stack_value (I believe).
>
> As discussed, one of the key distinctions that DW_OP_LLVM_direct is used for is distinguishing between memory and register locations; this is exactly the same as the difference between dbg.addr(%a, "a", ()) and dbg.value(%a, "a", ()). The former would become DBG_VALUE %a, "a", () and the latter would become DBG_VALUE %a, "a", (DW_OP_LLVM_direct).
>
> > Do you have an example of a dbg.value that isn't a DW_OP_stack_value where it makes sense to distinguish between a memory and a register location?
> >
> > Perhaps another way to phrase this question: is there a difference between
> >
> >   dbg.value(my_alloca, var, !DIExpression(DW_OP_deref, DW_OP_LLVM_direct))
> >
> > and
> >
> >   dbg.value(my_alloca, var, !DIExpression(DW_OP_deref)) ?
>
> So with that in mind, we wouldn't need to produce these, as in both cases the intent would be that the value of "var" is at the address given by "my_alloca". When we produce the corresponding DBG_VALUEs for these, both would end with DW_OP_LLVM_direct. This would change if we unified the IR debug intrinsics so that a single intrinsic could represent both memory locations and register/implicit locations, as opposed to the current state where the former can only be represented by dbg.declare/dbg.addr and the latter can only be represented by dbg.value.
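
The alternative raised in this message, keeping the DIExpression untouched and recording directness as a property of the MIR instruction, might look roughly like the following Python sketch. It is a toy model under stated assumptions: the names DbgValueInstr, is_direct, lower_with_flag and location_kind are hypothetical, and treating an expression that already carries an implicit specifier as not needing the flag is one possible reading, not something the thread settles.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Assumed set of implicit location specifiers ("implicit_ptr" in the thread
# is taken to mean DW_OP_implicit_pointer).
IMPLICIT_OPS = {"DW_OP_stack_value", "DW_OP_implicit_pointer"}

ExprElem = Union[str, int]


@dataclass(frozen=True)
class DbgValueInstr:
    """Hypothetical MIR debug value carrying directness as a flag."""
    operand: str
    variable: str
    expr: Tuple[ExprElem, ...] = ()
    is_direct: bool = False  # hypothetical flag, not an existing MIR field


def lower_with_flag(kind: str, operand: str, variable: str,
                    expr: Tuple[ExprElem, ...] = ()) -> DbgValueInstr:
    """Lower an IR intrinsic without rewriting its expression."""
    if kind not in ("dbg.value", "dbg.addr", "dbg.declare"):
        raise ValueError(f"unknown debug intrinsic: {kind}")
    expr = tuple(expr)
    has_implicit = any(op in IMPLICIT_OPS for op in expr)
    # Directness comes from the choice of intrinsic, not from the expression.
    direct = kind == "dbg.value" and not has_implicit
    return DbgValueInstr(operand, variable, expr, is_direct=direct)


def location_kind(instr: DbgValueInstr) -> str:
    """Classify the location a debug value describes in this toy model."""
    if any(op in IMPLICIT_OPS for op in instr.expr):
        return "implicit"
    return "direct" if instr.is_direct else "memory"
```

The expression tuple passes through unchanged in every case, so no DIExpression needs rewriting and frontends never see an operator that has no effect at the IR level.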
Tozer, Stephen via llvm-dev
2020-Oct-08 11:07 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> So I wonder if we should instead model this only at the MIR level, where this distinction actually makes sense. In MIR, we probably don't want to rewrite every DIExpression, so it would make sense to model it either as a flag on the intrinsic, or by having two kinds of intrinsics.

That works for me - as long as we have the ability to represent these expressions, it should be fine. What will be slightly awkward is maintaining this at the same time as the old DBG_VALUE: we would have two flags, on two different but related instructions, with the same name and with meanings that are almost the same but slightly different. Still, the old version will be deprecated, so we shouldn't have to worry too much.