Brian Gesiak via llvm-dev
2020-Feb-06 13:19 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Hi all,

I’m working on improving the debugging experience for C++20 coroutines when compiled with LLVM/Clang, and I could use some help from someone who understands debug information and DWARF (knowledge of coroutines isn't necessary, I don't think).

Specifically, I’m trying to improve lldb’s behavior when showing variables in the current stack frame, when that frame corresponds to a coroutine function.

To illustrate my problem, I uploaded C++ source, LLVM IR, a DWARF dump, and a shell script demonstrating the invocations I used to create each of these, as a gist on GitHub:
https://gist.github.com/modocache/670bc38e5a5ea2e0a3d6bafe8ea9c693

I'm looking at lldb's behavior on lines 23-40 of the C++ program, https://gist.github.com/modocache/670bc38e5a5ea2e0a3d6bafe8ea9c693#file-test-cpp-L23-L40, which I’ll paste below:

```
coro foo() {
  int i = 0;
  ++i;
  printf("%d\n", i); // 1
  // Breakpoint 1:
  //   (lldb) frame variable i
  //   (int) i = 1
  co_await suspend_always();
  int j = 0;
  ++i;
  ++j;
  printf("%d, %d\n", i, j); // 2, 1
  // Breakpoint 2:
  //   (lldb) frame variable i
  //   (int) i = <variable not available>
  //   (lldb) frame variable j
  //   (int) j = 1
```

Here 'foo' is a coroutine, and the comments denote commands executed at the lldb prompt when stopped at a breakpoint placed at the 'printf' above. At breakpoint 1, lldb correctly shows the value of 'i' to be 1. At breakpoint 2, 'i' is shown as 'variable not available'. ('j', however, is shown correctly.)

Looking at the LLVM IR debug info metadata (and keeping in mind I'm no expert at this stuff), I don't see anything out of the ordinary. Coroutine passes outline sections of the coroutine function, and to maintain state, they replace references to stack frame variables with loads and stores onto a "coroutine frame" object. But, looking at the IR in test.ll lines 221-238, the 'llvm.dbg.value' intrinsic is being used to denote the location of the values, which I think should allow lldb to print the correct values:
https://gist.github.com/modocache/670bc38e5a5ea2e0a3d6bafe8ea9c693#file-test-ll-L221-L238

The 'llvm.dbg.value' intrinsics reference metadata in slots !659 and !668, and these seem correct to me as well:

```
!659 = !DILocalVariable(name: "i", scope: !660, file: !5, line: 24, type: !130)
!668 = !DILocalVariable(name: "j", scope: !660, file: !5, line: 32, type: !130)
```

Finally, I looked at the DWARF being produced for the program. When stopped at breakpoint 2 above, executing 'disassemble' at the lldb prompt shows me the program counters for the region I'm interested in:

```
    0x401885 <+373>: movq   -0x8(%rbp), %rax
    0x401889 <+377>: movl   $0x0, 0x40(%rax)
    0x401890 <+384>: movl   0x28(%rax), %edx
    0x401893 <+387>: addl   $0x1, %edx
    0x401896 <+390>: movl   %edx, 0x28(%rax)
    0x401899 <+393>: movl   0x40(%rax), %edx
    0x40189c <+396>: addl   $0x1, %edx
    0x40189f <+399>: movl   %edx, 0x40(%rax)
->  0x4018a2 <+402>: movl   0x28(%rax), %esi
```

Specifically, I think 0x401893 and 0x40189c respectively show 'i' and 'j' being incremented by 1.

Now, looking at the DWARF dump, I can see that 'i' is live between 0x401893 and 0x40189c, and 'j' is live between 0x40189c and 0x4018b9:
https://gist.github.com/modocache/670bc38e5a5ea2e0a3d6bafe8ea9c693#file-test-dwarfdump-txt-L4063-L4086

Sure enough, when I break in lldb within the region for 'i', between 0x401893 and 0x40189c, 'frame variable i' succeeds. But outside of that region, lldb tells me '(int) i = <variable not available>'.

How do I improve the debug info here such that lldb can show 'i' outside of that small live range?

Any and all help, questions, comments, are very much appreciated!

- Brian Gesiak
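The behavior described in this message matches how a debugger consults a DWARF location list: each entry covers a half-open range of program counters, and a PC that falls in no entry has no known location. The following is only a minimal C++ sketch of that lookup, not lldb's implementation. The names `LocationEntry`, `locate`, and `locationListForI` are hypothetical, the range is the one quoted above for 'i', and the register name "edx" is an assumption read off the disassembly rather than taken from the DWARF dump.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// One entry of a simplified location list: the variable lives in `where`
// for program counters in the half-open range [begin, end).
struct LocationEntry {
    std::uint64_t begin;
    std::uint64_t end;
    std::string where;
};

// Return the location covering `pc`, or std::nullopt when no entry covers it.
// The std::nullopt case corresponds to "<variable not available>".
std::optional<std::string> locate(const std::vector<LocationEntry>& list,
                                  std::uint64_t pc) {
    for (const LocationEntry& entry : list) {
        if (pc >= entry.begin && pc < entry.end) {
            return entry.where;
        }
    }
    return std::nullopt;
}

// Hypothetical location list for 'i', using the range quoted in the message.
// The register name is an assumption based on the disassembly.
std::vector<LocationEntry> locationListForI() {
    return {LocationEntry{0x401893, 0x40189c, "edx"}};
}
```

Under these assumptions, `locate(locationListForI(), 0x401896)` finds an entry, while `locate(locationListForI(), 0x4018a2)`, the PC at breakpoint 2, finds none.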
Jeremy Morse via llvm-dev
2020-Feb-06 16:04 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Hi Brian,

Thanks for working on coroutines, the debugging experience, and in particular thanks for the comprehensive write-up!

On Thu, Feb 6, 2020 at 1:19 PM Brian Gesiak via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Specifically, I’m trying to improve lldb’s behavior when showing
> variables in the current stack frame, when that frame corresponds to a
> coroutine function.

[...]

Everything in the IR appears correct to my eyes, although I know next to nothing about coroutines and might have missed something. The simplest explanation of why the variable location goes missing can be seen in the disassembly:

> ```
>     0x401885 <+373>:   movq   -0x8(%rbp), %rax
>     0x401889 <+377>:   movl   $0x0, 0x40(%rax)
>     0x401890 <+384>: X movl   0x28(%rax), %edx
>     0x401893 <+387>: X addl   $0x1, %edx
>     0x401896 <+390>: X movl   %edx, 0x28(%rax)
>     0x401899 <+393>: X movl   0x40(%rax), %edx
>     0x40189c <+396>:   addl   $0x1, %edx
>     0x40189f <+399>:   movl   %edx, 0x40(%rax)
> ->  0x4018a2 <+402>:   movl   0x28(%rax), %esi
> ```

Where I've marked with 'X' before the mnemonic the instructions that the variable location list covers. The location of "i" is correctly given as edx from its load to its store, and ends when edx is overwritten with the value of "j". In all the rest of the code, the variable's value is in memory, and the DWARF data doesn't record this.

Ideally debug info would track variables when they're stored to memory -- however we don't automatically know whether any subsequent store to memory will overwrite that variable, and so we don't track locations into memory. PR40628 [0] is an example of what can go wrong, where we described a variable as being in memory, but didn't know when that location was overwritten.

If whatever's producing the coroutine IR has guarantees about where and when variables are loaded/stored from/to memory, it should be possible to put more information into the IR, so that the rest of LLVM doesn't have to guess. For example, this portion of IR:

    %15 = load i32, i32* %i.reload.addr62, align 4, !dbg !670
    call void @llvm.dbg.value(metadata i32 %15, metadata !659, metadata !DIExpression()), !dbg !661
    %inc19 = add nsw i32 %15, 1, !dbg !670
    call void @llvm.dbg.value(metadata i32 %inc19, metadata !659, metadata !DIExpression()), !dbg !661
    store i32 %inc19, i32* %i.reload.addr62, align 4, !dbg !670

Could have a call to llvm.dbg.addr(metadata i32 *%i.reload.addr66, ...) inserted after the store, indicating that the variable is located in memory. This should work (TM) so long as two conditions hold on every path after the call to llvm.dbg.addr: that memory is never overwritten with something that isn't the current value of "i"; and whenever the variable is loaded from memory, there's a call to llvm.dbg.value to indicate that the variable is now located somewhere other than memory.

Providing that extra information should improve the location coverage for your example, certainly when unoptimised. However, I believe (80%) this method isn't safe against optimisation, because (for example) dead stores can be deleted by LLVM passes without deleting the call to llvm.dbg.addr, pointing the variable location at a stale value in memory. Unfortunately I'm not aware of a facility or technique that protects against this right now. (CC Reid who I think ran into this last?)

Note that there's some support for tracking variables through stack spills in post-isel debug data passes, however those loads and stores operate in well defined ways, and general loads and stores might not.

[0] https://bugs.llvm.org/show_bug.cgi?id=40628

--
Thanks,
Jeremy
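To make the effect of the suggested llvm.dbg.addr call concrete, here is a toy C++ model of a variable's reported location along one straight-line path. It is a sketch under stated assumptions, not how LLVM computes location lists: the `Event` and `Location` enums and the `trackLocation` function are hypothetical names, and the model deliberately ignores the hazard described above where a deleted dead store leaves the memory location stale.

```cpp
#include <vector>

// Events along a single straight-line path, in program order (hypothetical).
enum class Event {
    DbgValueInRegister,  // llvm.dbg.value: the variable is in a register
    DbgAddrInMemory,     // llvm.dbg.addr: the variable is in its memory slot
    RegisterClobbered    // the register is reused for another value
};

enum class Location { Unknown, Register, Memory };

// Toy model: the location a debugger could report after each event.
std::vector<Location> trackLocation(const std::vector<Event>& path) {
    std::vector<Location> result;
    Location current = Location::Unknown;
    for (Event event : path) {
        switch (event) {
        case Event::DbgValueInRegister:
            current = Location::Register;
            break;
        case Event::DbgAddrInMemory:
            current = Location::Memory;
            break;
        case Event::RegisterClobbered:
            // A register location ends when the register is overwritten;
            // a memory location is unaffected by this event.
            if (current == Location::Register) {
                current = Location::Unknown;
            }
            break;
        }
        result.push_back(current);
    }
    return result;
}
```

In this model, the path load, increment, clobber ends with `Location::Unknown`, which mirrors the lost location for "i" once edx is reused. Inserting a `DbgAddrInMemory` event before the clobber leaves the final state at `Location::Memory`.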
Brian Gesiak via llvm-dev
2020-Feb-12 05:22 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Apologies for the slow response here Jeremy. Your reply has been incredibly helpful so far, I just need to try adding 'llvm.dbg.addr' myself to confirm that works. Thank you!

- Brian Gesiak