Tozer, Stephen via llvm-dev
2020-Sep-04 10:00 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> Yeah, because that decision can only be made much later in LLVM in AsmPrinter/DwarfExpression.cpp.
> In DWARF, DW_OP_reg(x) is a register l-value, all others can either be l-values or r-values depending on whether there is a DW_OP_stack_value/DW_OP_implicit* at the end.

Yes, it might not be clear but that's what I'm trying to say. Out of the non-empty DWARF locations, register and memory locations are l-values, implicit locations are r-values. You can technically use DW_OP_breg in an l-value, but not for register locations. This is why when we have a DBG_VALUE that has a single register location operand with an otherwise empty DIExpression, we need some indicator to determine whether we want to produce the register location [DW_OP_reg] or the memory location [DW_OP_breg] (currently this indicator is the indirectness flag).

> I think it would be confusing to talk about registers at the LLVM IR / DIExpression level. "SSA-Values"?

I think terminology is a bit difficult here because this work concerns both the llvm.dbg.value intrinsic and the DBG_VALUE instruction, which operate on different kinds of arguments. I think "location operands" is probably the best description for them, since they are operands to a DIExpression which is used to compute the variable location.

> I don't think that's correct, because a DW_OP_stack_value is an rvalue. But maybe I misunderstood what you were trying to say.
> We should start by defining what DW_OP_stack_value really means in LLVM debug info metadata. I believe it should just mean "r-value".

Having given it some more thought, I've changed my mind - I agree that we shouldn't use DW_OP_stack_value in this case, because it would be changing its meaning, which is to explicitly declare the expression to be an implicit location/r-value.
My current line of thinking is that it would be better to introduce a new operator, named DW_OP_LLVM_direct or something similar, which has the meaning "the variable's exact value is produced by the preceding expression", and would replace DW_OP_stack_value as it is currently used within LLVM.

To summarise the logic behind using this operator: LLVM debug info does not need to explicitly care about r-values or l-values before DWARF emission, only whether we're describing a variable's memory location, a variable's exact value, or some other implicit location (such as implicit_pointer). Whether an expression is an r-value or l-value can be trivially determined at the end of the pipeline (addMachineRegExpression already does this).

For an expression ending with DW_OP_LLVM_direct: if the preceding expression is only a single register then we emit a register location, if the preceding expression ends with DW_OP_deref then we can remove the deref and emit a memory location, and otherwise we emit the expression with DW_OP_stack_value. In expression syntax it would behave like an implicit operator, in that it can only appear at the end of an expression and is incompatible with any implicit operators, including DW_OP_stack_value.

The alternative I see for this is using a flag or a new DIExpression operator that explicitly declares a single register DBG_VALUE to be a register location, while it would otherwise be treated as a memory location, and use stack_value for all other cases. The main reason I prefer the "direct" operator is that LLVM doesn't need to know whether a DIExpression results in an l-value location or an r-value location; it only needs to know how to compute the variable's location and then determine whether that computation resolves to an l-value or r-value at the end.
Maintaining two separate representations for stack value locations and register locations when we don't need to is an unnecessary burden, especially when it may be possible for a given dbg.value/DBG_VALUE to switch back and forth between them.
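
The lowering described above for an expression ending with DW_OP_LLVM_direct can be illustrated with a small C++ sketch. This is not LLVM code: the `Lowered` struct, the `LocKind` enum, and `lowerDirect` are hypothetical names, operators are modelled as plain strings, and the single register location operand is assumed to be implicit, so an empty operator list stands for "the preceding expression is only a single register".

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model (not LLVM's API): the kinds of DWARF
// location that an expression ending in DW_OP_LLVM_direct can lower to.
enum class LocKind { Register, Memory, StackValue };

struct Lowered {
    LocKind kind;
    std::vector<std::string> ops; // operators remaining after lowering
};

// `ops` holds the operators that precede the trailing DW_OP_LLVM_direct.
// Assumption: they are applied to one register location operand.
Lowered lowerDirect(std::vector<std::string> ops) {
    if (ops.empty()) {
        // Only the register itself: emit a register location (DW_OP_regN).
        return {LocKind::Register, {}};
    }
    if (ops.back() == "DW_OP_deref") {
        // The value is loaded from an address: drop the deref and describe
        // that address as a memory location.
        ops.pop_back();
        return {LocKind::Memory, ops};
    }
    // Any other computed value: emit it as an implicit stack value.
    ops.push_back("DW_OP_stack_value");
    return {LocKind::StackValue, ops};
}
```

For example, under these assumptions `lowerDirect({"DW_OP_plus_uconst 8", "DW_OP_deref"})` yields a memory location whose only remaining operator is `DW_OP_plus_uconst 8`, while `lowerDirect({"DW_OP_plus_uconst 8"})` yields the same operator followed by `DW_OP_stack_value`.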
Adrian Prantl via llvm-dev
2020-Sep-04 15:59 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> On Sep 4, 2020, at 3:00 AM, Tozer, Stephen <stephen.tozer at sony.com> wrote:
>
> > Yeah, because that decision can only be made much later in LLVM in AsmPrinter/DwarfExpression.cpp.
> > In DWARF, DW_OP_reg(x) is a register l-value, all others can either be l-values or r-values depending on whether there is a DW_OP_stack_value/DW_OP_implicit* at the end.
>
> Yes, it might not be clear but that's what I'm trying to say. Out of the non-empty DWARF locations, register and memory locations are l-values, implicit locations are r-values. You can technically use DW_OP_breg in an l-value, but not for register locations. This is why when we have a DBG_VALUE that has a single register location operand with an otherwise empty DIExpression, we need some indicator to determine whether we want to produce the register location [DW_OP_reg] or the memory location [DW_OP_breg] (currently this indicator is the indirectness flag).
>
> > I think it would be confusing to talk about registers at the LLVM IR / DIExpression level. "SSA-Values"?
>
> I think terminology is a bit difficult here because this work concerns both the llvm.dbg.value intrinsic and the DBG_VALUE instruction, which operate on different kinds of arguments. I think "location operands" is probably the best description for them, since they are operands to a DIExpression which is used to compute the variable location.
>
> > I don't think that's correct, because a DW_OP_stack_value is an rvalue. But maybe I misunderstood what you were trying to say.
> > We should start by defining what DW_OP_stack_value really means in LLVM debug info metadata. I believe it should just mean "r-value".
>
> Having given it some more thought, I've changed my mind - I agree that we shouldn't use DW_OP_stack_value in this case, because it would be changing its meaning, which is to explicitly declare the expression to be an implicit location/r-value.
> My current line of thinking is that it would be better to introduce a new operator, named DW_OP_LLVM_direct or something similar, which has the meaning "the variable's exact value is produced by the preceding expression", and would replace DW_OP_stack_value as it is currently used within LLVM.

Can you elaborate what "direct" means? I'm having trouble understanding what the opposite (a non-exact value) would be.

> To summarise the logic behind using this operator: LLVM debug info does not need to explicitly care about r-values or l-values before DWARF emission,

I don't think that statement is correct. Based on the semantics, LLVM IR knows that a dbg.declare is an l-value: the debugger can write to it and the value will be changed when continuing the program execution. It can also decide that a "working copy" of the value, described by a dbg.value, is a legit read-only representation of the variable, but can't be written to because, e.g., the value exists in more than one place at once.

At the moment we don't make the lvalue/rvalue distinction in LLVM at all. We make an educated guess in AsmPrinter. But that's wrong and something we should strive to fix during this redesigning.

> only whether we're describing a variable's memory location, a variable's exact value, or some other implicit location (such as implicit_pointer). Whether an expression is an r-value or l-value can be trivially determined at the end of the pipeline (addMachineRegExpression already does this).

As stated above, I don't think we can trivially determine this, because (at least for dbg.values) this info was lost already in LLVM IR. Unless we say the dbg.declare / dbg.value distinction is what determines lvalues vs. rvalues.

> For an expression ending with DW_OP_LLVM_direct: if the preceding expression is only a single register then we emit a register location, if the preceding expression ends with DW_OP_deref then we can remove the deref and emit a memory location, and otherwise we emit the expression with DW_OP_stack_value. In expression syntax it would behave like an implicit operator, in that it can only appear at the end of an expression and is incompatible with any implicit operators, including DW_OP_stack_value.
>
> The alternative I see for this is using a flag or a new DIExpression operator that explicitly declares a single register DBG_VALUE to be a register location, while it would otherwise be treated as a memory location, and use stack_value for all other cases. The main reason I prefer the "direct" operator is that LLVM doesn't need to know whether a DIExpression results in an l-value location or an r-value location; it only needs to know how to compute the variable's location and then determine whether that computation resolves to an l-value or r-value at the end. Maintaining two separate representations for stack value locations and register locations when we don't need to is an unnecessary burden, especially when it may be possible for a given dbg.value/DBG_VALUE to switch back and forth between them.

I do think that your insight that we need one (or more?) additional discriminator of some kind is correct; we just need to find the right semantics for it.

thanks,
adrian
Tozer, Stephen via llvm-dev
2020-Sep-11 18:12 UTC
[llvm-dev] [Debuginfo] Changing llvm.dbg.value and DBG_VALUE to support multiple location operands
> Can you elaborate what "direct" means? I'm having trouble understanding what the opposite (a non-exact value) would be.

Apologies, "exact" was a misleading/incorrect term. By direct, I mean that the expression computes the value of the variable, as opposed to its memory address, or the value that it points to. Within LLVM, where we don't have DW_OP_reg/DW_OP_breg but instead simply refer to a generic SSA value, this could mean either a register location or stack value.

> At the moment we don't make the lvalue/rvalue distinction in LLVM at all. We make an educated guess in AsmPrinter. But that's wrong and something we should strive to fix during this redesigning.

I think the opposite; I don't believe there's any reason we need to make the explicit lvalue/rvalue distinction until we're writing DWARF. To put it in more general terms, I think that the IR/MIR debug value instructions should only care about how the variable's value can be computed. Whether the result of that computation is an lvalue is unimportant within LLVM itself as far as I can tell, and is redundant when it can be computed from just the DIExpression and location operands.

> As stated above, I don't think we can trivially determine this, because (at least for dbg.values) this info was lost already in LLVM IR. Unless we say the dbg.declare / dbg.value distinction is what determines lvalues vs. rvalues.

With the proposed operator, it would be trivial to determine lvalue vs rvalue debug values with a set of rules (ignoring any fragment operator, which may appear at the end but does not affect the location type):

1. If the expression is empty, or any location arguments are $noreg => Empty
2. If the expression ends with DW_OP_implicit_pointer => Implicit pointer (rvalue)
3. If the expression ends with DW_OP_stack_value => Stack value (rvalue) // LLVM should produce LLVM_direct instead.
4. If the expression ends with DW_OP_LLVM_direct, then...
   4a. If the preceding expression is just DW_OP_LLVM_arg, 0 and the only location operand is a register => Register location (lvalue)
   4b. Otherwise => Stack value (rvalue)
5. Otherwise => Memory location (lvalue)

This covers all the expected cases without ambiguity and with almost no loss of expressiveness. I believe that the only expression that LLVM will not be able to produce like this is DW_OP_bregN, DW_OP_stack_value, due to the fact that when DW_OP_LLVM_direct is used, this would be written as a register location instead of a stack value. I don't think there are any cases where we would choose to emit a stack value location when we're able to produce a register location instead, so this shouldn't be a problem.
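
The classification rules above are mechanical enough to sketch in C++. The following is only an illustration under stated assumptions, not LLVM's implementation: `DebugValue`, `LocType`, and `classify` are hypothetical names, operators are modelled as strings with any trailing fragment operator already stripped, `DW_OP_LLVM_arg, 0` is written as the single string "DW_OP_LLVM_arg 0", and the implicit pointer operator is spelled with its DWARF 5 name DW_OP_implicit_pointer.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the location types named in the rules (not LLVM's API).
enum class LocType { Empty, ImplicitPointer, StackValue, Register, Memory };

struct DebugValue {
    std::vector<std::string> expr;  // operators, fragment operator already removed
    std::vector<bool> operandIsReg; // one entry per location operand
    bool anyOperandNoReg = false;   // true if any location operand is $noreg
};

LocType classify(const DebugValue &dv) {
    // Rule 1: empty expression or a $noreg operand.
    if (dv.expr.empty() || dv.anyOperandNoReg)
        return LocType::Empty;

    const std::string &last = dv.expr.back();
    // Rule 2: implicit pointer (rvalue).
    if (last == "DW_OP_implicit_pointer")
        return LocType::ImplicitPointer;
    // Rule 3: explicit stack value (rvalue).
    if (last == "DW_OP_stack_value")
        return LocType::StackValue;
    // Rule 4: the proposed DW_OP_LLVM_direct.
    if (last == "DW_OP_LLVM_direct") {
        bool onlyArg0 = dv.expr.size() == 2 && dv.expr[0] == "DW_OP_LLVM_arg 0";
        bool singleReg = dv.operandIsReg.size() == 1 && dv.operandIsReg[0];
        // 4a: a lone register operand is a register location (lvalue);
        // 4b: anything else is a stack value (rvalue).
        return (onlyArg0 && singleReg) ? LocType::Register : LocType::StackValue;
    }
    // Rule 5: everything else is a memory location (lvalue).
    return LocType::Memory;
}
```

Because every branch depends only on the expression and its location operands, this sketch needs no separate lvalue/rvalue flag, which is the point the rules are meant to make.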