Brian Gesiak via llvm-dev
2020-Feb-12 05:22 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Apologies for the slow response here Jeremy. Your reply has been incredibly helpful so far, I just need to try adding 'llvm.dbg.addr' myself to confirm that works. Thank you!

- Brian Gesiak

On Thu, Feb 6, 2020 at 11:04 AM Jeremy Morse <jeremy.morse.llvm at gmail.com> wrote:
> Hi Brian,
>
> Thanks for working on coroutines, the debugging experience, and in
> particular thanks for the comprehensive write-up!
>
> On Thu, Feb 6, 2020 at 1:19 PM Brian Gesiak via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > Specifically, I’m trying to improve lldb’s behavior when showing
> > variables in the current stack frame, when that frame corresponds to a
> > coroutine function.
> > [...]
>
> Everything in the IR appears correct to my eyes, although I know next
> to nothing about coroutines and might have missed something. The
> simplest explanation of why the variable location goes missing can be
> seen in the disassembly:
>
> > ```
> >    0x401885 <+373>:   movq   -0x8(%rbp), %rax
> >    0x401889 <+377>:   movl   $0x0, 0x40(%rax)
> >    0x401890 <+384>: X movl   0x28(%rax), %edx
> >    0x401893 <+387>: X addl   $0x1, %edx
> >    0x401896 <+390>: X movl   %edx, 0x28(%rax)
> >    0x401899 <+393>: X movl   0x40(%rax), %edx
> >    0x40189c <+396>:   addl   $0x1, %edx
> >    0x40189f <+399>:   movl   %edx, 0x40(%rax)
> > -> 0x4018a2 <+402>:   movl   0x28(%rax), %esi
> > ```
>
> Where I've marked with 'X' before the mnemonic the instructions that
> the variable location list covers. The location of "i" is correctly
> given as edx from its load to its store, and ends when edx is
> overwritten with the value of "j". In all the rest of the code, the
> variable's value is in memory, and the DWARF data doesn't record this.
>
> Ideally debug info would track variables when they're stored to memory
> -- however we don't automatically know whether any subsequent store to
> memory will overwrite that variable, and so we don't track locations
> into memory. PR40628 [0] is an example of what can go wrong, where we
> described a variable as being in memory, but didn't know when that
> location was overwritten.
>
> If whatever's producing the coroutine IR has guarantees about where
> and when variables are loaded/stored from/to memory, it should be
> possible to put more information into the IR, so that the rest of LLVM
> doesn't have to guess. For example, this portion of IR:
>
>   %15 = load i32, i32* %i.reload.addr62, align 4, !dbg !670
>   call void @llvm.dbg.value(metadata i32 %15, metadata !659, metadata !DIExpression()), !dbg !661
>   %inc19 = add nsw i32 %15, 1, !dbg !670
>   call void @llvm.dbg.value(metadata i32 %inc19, metadata !659, metadata !DIExpression()), !dbg !661
>   store i32 %inc19, i32* %i.reload.addr62, align 4, !dbg !670
>
> Could have a call to llvm.dbg.addr(metadata i32 *%i.reload.addr66,
> ...) inserted after the store, indicating that the variable is located
> in memory. This should work (TM) so long as two conditions hold on
> every path after the call to llvm.dbg.addr: that memory is never
> overwritten with something that isn't the current value of "i"; and
> whenever the variable is loaded from memory, there's a call to
> llvm.dbg.value to indicate that the variable is now located somewhere
> other than memory.
>
> Providing that extra information should improve the location coverage
> for your example, certainly when unoptimised. However, I believe (80%)
> this method isn't safe against optimisation, because (for example)
> dead stores can be deleted by LLVM passes without deleting the call to
> llvm.dbg.addr, pointing the variable location at a stale value in
> memory. Unfortunately I'm not aware of a facility or technique that
> protects against this right now. (CC Reid who I think ran into this
> last?).
>
> Note that there's some support for tracking variables through stack
> spills in post-isel debug data passes, however those loads and stores
> operate in well defined ways, and general loads and stores might not.
>
> [0] https://bugs.llvm.org/show_bug.cgi?id=40628
>
> --
> Thanks,
> Jeremy
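
Jeremy's explanation of the location list can be illustrated with a small model. This is only a sketch, not LLDB's or LLVM's real data structures: the tuple layout and the names `lookup` and `describe` are hypothetical, and the covered range is assumed to be exactly the instructions marked 'X' in the disassembly, i.e. from 0x401890 up to but not including 0x40189c.

```python
from typing import List, Optional, Tuple

# Illustrative model only. Each entry is
# (start_pc inclusive, end_pc exclusive, location description).
LocationList = List[Tuple[int, int, str]]

# Assumed range: the instructions marked 'X' in the quoted disassembly.
I_LOCATIONS: LocationList = [(0x401890, 0x40189C, "reg edx")]


def lookup(locations: LocationList, pc: int) -> Optional[str]:
    """Return the location whose range covers pc, or None if none does."""
    for start, end, where in locations:
        if start <= pc < end:
            return where
    return None


def describe(name: str, locations: LocationList, pc: int) -> str:
    """Mimic what a debugger could report for a variable at a given pc."""
    where = lookup(locations, pc)
    if where is None:
        return f"{name} = <variable not available>"
    return f"{name} is in {where}"


# Inside the covered range the register location is found.
print(describe("i", I_LOCATIONS, 0x401893))
# At the current pc from the disassembly, no entry covers the address.
print(describe("i", I_LOCATIONS, 0x4018A2))
```

Under these assumptions, the pc at 0x4018a2 falls outside the only range in the list, which is why the debugger has nothing to report there even though the value still sits in memory at 0x28(%rax).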
Brian Gesiak via llvm-dev
2020-Feb-25 19:42 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Thanks all, especially Jeremy, for your help.

> On Thu, Feb 6, 2020 at 11:04 AM Jeremy Morse <jeremy.morse.llvm at gmail.com> wrote:
>> Everything in the IR appears correct to my eyes, although I know next
>> to nothing about coroutines and might have missed something.

Yes, good point. I think a better explanation of the coroutine passes is in order. The coro-split pass first uses 'llvm::LowerDbgDeclare' to replace all llvm.dbg.declare for %i with llvm.dbg.value, and then replaces all uses of the '%i = alloca i32' instruction with a getelementptr instruction like this one:

  %i.reload.addr63 = getelementptr inbounds %_Z3foov.Frame, %_Z3foov.Frame* %FramePtr, i32 0, i32 7, !dbg !651

In other words, the value of %i is stored on the frame object, on the heap, at an offset of 7 into the frame. I'm beginning to think a fundamental fix for this issue would be to stop replacing llvm.dbg.declare with llvm.dbg.value, and instead replace the llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i variable's new permanent location as an offset into the coroutine frame object. Does this approach make sense to people on this mailing list, who probably know more about how these intrinsics work than I do?

>> If whatever's producing the coroutine IR has guarantees about where
>> and when variables are loaded/stored from/to memory, it should be
>> possible to put more information into the IR, so that the rest of LLVM
>> doesn't have to guess. For example, this portion of IR:
>>
>>   %15 = load i32, i32* %i.reload.addr62, align 4, !dbg !670
>>   call void @llvm.dbg.value(metadata i32 %15, metadata !659, metadata !DIExpression()), !dbg !661
>>   %inc19 = add nsw i32 %15, 1, !dbg !670
>>   call void @llvm.dbg.value(metadata i32 %inc19, metadata !659, metadata !DIExpression()), !dbg !661
>>   store i32 %inc19, i32* %i.reload.addr62, align 4, !dbg !670
>>
>> Could have a call to llvm.dbg.addr(metadata i32 *%i.reload.addr66,
>> ...) inserted after the store, indicating that the variable is located
>> in memory.

I tried multiple approaches to manually inserting an llvm.dbg.addr after the store instruction, as per your suggestion, Jeremy. I used llc to compile the IR into an object file that I then linked, and inspected the DWARF generated for the file. Unfortunately, inserting a dbg.addr that operated on the reloaded values didn't lead to any change in the DWARF that was produced -- specifically, this didn't make a difference:

  call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata !873, metadata !DIExpression()), !dbg !884

I also tried adding a dbg.addr that attempted to point the debugger to the %i variable's location at its offset on the coroutine frame:

  call void @llvm.dbg.addr(metadata %_Z3foov.Frame* %FramePtr, metadata !873, metadata !DIExpression(DW_OP_plus, 28)), !dbg !884

This changed the live ranges for %i in the DWARF that was output, but not in a way that made the %i variable visible to the debugger.

Again, I wonder if the correct path forward here is, instead of attempting to expand the live ranges in the DWARF for %i, to signal in the debug info that the value of %i is always available to read from offset 7 into the coroutine frame. If this makes sense, then I need to learn how to convey that via the llvm.dbg intrinsics -- llvm.dbg.addr sounds like what I want, doesn't it?

Again, thanks for all the help on this list. (PS: I've also been enjoying reading the proposals you sent, Jeremy, for post-ISel debug info salvaging!) Please let me know if my thinking above sounds right or misguided in any way -- thanks!

- Brian Gesiak
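
The idea of a single, always-valid memory location can be sketched as follows. This is a toy model under stated assumptions, not what LLVM or lldb actually does: the byte offset 0x28 is taken from the disassembly earlier in the thread (where "i" is loaded from 0x28(%rax)), the frame size and the helper `read_i32` are made up for illustration, and the frame is simulated with a byte buffer. Note that the `i32 7` in the getelementptr is a struct field index, whereas an offset applied in a DIExpression is counted in bytes.

```python
import struct

# Hypothetical values for illustration only.
I_OFFSET = 0x28    # byte offset of "i", read off the earlier disassembly
FRAME_SIZE = 0x48  # made-up size of the simulated coroutine frame


def read_i32(frame: bytes, offset: int) -> int:
    """Read a little-endian signed 32-bit integer at a byte offset."""
    return struct.unpack_from("<i", frame, offset)[0]


# Simulate a heap-allocated coroutine frame in which i == 3.
frame = bytearray(FRAME_SIZE)
struct.pack_into("<i", frame, I_OFFSET, 3)

# With one memory location that is valid for the whole function, the
# lookup is "frame pointer plus byte offset" and does not depend on pc.
value_of_i = read_i32(frame, I_OFFSET)
print(value_of_i)
```

The design point is that a frame-relative location needs no per-instruction ranges at all, at the cost of only reflecting values once they have been stored back to the frame.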
Jeremy Morse via llvm-dev
2020-Feb-26 16:01 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Hi Brian,

On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at gmail.com> wrote:
> In other words, the value of %i is stored on the frame object, on the
> heap, at an offset of 7 into the frame. I'm beginning to think a
> fundamental fix for this issue would be to stop replacing
> llvm.dbg.declare with llvm.dbg.value, and instead replace the
> llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i
> variable's new permanent location as an offset into the coroutine
> frame object. Does this approach make sense to people on this mailing
> list, who probably know more about how these intrinsics work than I
> do?

This matches a few similar use cases that I'm aware of -- certain kinds of struct that are passed-by-value according to the language, but passed-by-reference according to ABI, are treated in that way. In general, the downside is that the debugger can only observe variable values when they get written to memory, not when they're computed, as dbg.values and dbg.declares aren't supposed to be mixed. Observing variable values slightly later might be an improvement over the current situation.

Although I don't think this will work immediately, see below.

> I tried multiple approaches to manually inserting an llvm.dbg.addr
> after the store instruction, as per your suggestion, Jeremy. I used
> llc to compile the IR into an object file that I then linked, and
> inspected the DWARF generated for the file. Unfortunately, inserting
> dbg.addr that operated on the reloaded values didn't lead to any
> change in the DWARF that was produced -- specifically, this didn't
> make a difference:
>
>   call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata !873, metadata !DIExpression()), !dbg !884

Ouch, I tried this myself, and ran into the same difficulty. I'd missed that all your functions are marked "optnone" / -O0, which means a different instruction-selection pass (FastISel) runs, and it turns out FastISel isn't aware of dbg.addr's existence. Even better, FastISel doesn't manage to lower any debug intrinsic (including dbg.declare) that refers to a GEP, because it doesn't have a register location (the GEP gets folded into a memory addressing mode).

I've hacked together some support in [0] that allows dbg.addrs of GEPs to be handled. A single dbg.addr at the start of the function (and no dbg.values) should get you the same behaviour as a dbg.declare.

I suspect the reason why this problem hasn't shown up in the past is that the coroutine code being generated hits a gap between "optimised" and "not optimised": I believe all variables in code that isn't optimised get their own storage (and so will always have a stack or register location), whereas in the coroutine code you're generating the variable address doesn't get storage.

If [0] is useful for you I can get that landed; it'd be good to hear whether this resolves the dbg.addr intrinsics not having an effect on the output.

[0] https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b

--
Thanks,
Jeremy