Brian Gesiak via llvm-dev
2020-Feb-26 23:29 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Vedant, Jeremy, Thanks a ton! I copied ASan's use of 'replaceDbgDeclare', think that worked! https://github.com/modocache/llvm-project/commit/afbc04e1dcba has some extremely quick and dirty changes I made (with no tests!), and a link to a Gist with the LLVM IR and DWARF produced, https://gist.github.com/modocache/6f29093ba2827946011b422ed3bd2903. There's only one kink: the spot I placed it has 'replaceDbgDeclare' called multiple times, and it reuses the result of a single getelementptr instruction. Because of this, only the last invocation results in correct debug info being produced. Put another way: when I place a breakpoint in my example on the last instance that %i is reloaded, repro.cpp:45 at https://gist.github.com/modocache/6f29093ba2827946011b422ed3bd2903#file-repro-cpp-L45, I get the correct debug info: i = 3, j = 2. In every other spot, I get 'error: Couldn't materialize: couldn't get the value of variable i: variable not available'. Still, is great progress for me. I think to work around this error, I need to do 2 things: 1. Call replaceDbgDeclare only once. 2. Call it with the correct arguments -- you may have noticed in https://github.com/modocache/llvm-project/commit/afbc04e1dcba#diff-f1204f42e8cb9e3718892e63059f8347R743 that I'm applying an offset of 0, which doesn't make much sense. I need to figure out how to use this function such that it encodes in DWARF to load at an offset from the coroutine frame, just as a getelementptr would. - Brian On Wed, Feb 26, 2020 at 5:15 PM Brian Gesiak <modocache at gmail.com> wrote:> > Vedant, thank you! I had meant to ask if any of this reminded you all > of something else that I could emulate. I'll look into uses of > 'replaceDbgDeclare' in SafeStack/ASan. - Brian > > On Wed, Feb 26, 2020 at 5:08 PM Vedant Kumar <vedant_kumar at apple.com> wrote: > > > > I haven't fully parsed this thread (sorry!), but I wanted to briefly mention that the SafeStack & ASan passes both do something similar (I think): move local variables backed by allocas onto a separate stack. These passes use replaceDbgDeclare to rewrite dbg.declares s.t. they point into the new stack. After that, llvm presumably runs LowerDbgDeclare (usually via InstCombine), but all the inserted dbg.values have useful locations relative to the new stack. > > > > It could be worth investigating whether replaceDbgDeclare is a good fit for the coro-split pass. > > > > vedant > > > > On Feb 26, 2020, at 1:32 PM, Brian Gesiak via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > > > Awesome, thanks Jeremy. > > > > On Wed, Feb 26, 2020 at 11:02 AM Jeremy Morse > > <jeremy.morse.llvm at gmail.com> wrote: > > > > > > Hi Brian, > > > > On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at gmail.com> wrote: > > > > In other words, the value of %i is stored on the frame object, on the > > heap, at an offset of 7 into the frame. I'm beginning to think a > > fundamental fix for this issue would be to stop replacing > > llvm.dbg.declare with llvm.dbg.value, and instead replace the > > llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i > > variable's new permanent location as an offset into the coroutine > > frame object. Does this approach make sense to people on this mailing > > list, who probably know more about how these intrinsics work than I > > do? > > > > > > This matches a few similar use cases that I'm aware of -- certain > > kinds of struct that are passed-by-value according to the language, > > but passed-by-reference according to ABI, are treated in that way. In > > general, the downside is that the debugger can only observe variable > > values when they get written to memory, not when they're computed, as > > dbg.values and dbg.declares aren't supposed to be mixed. Observing > > variable values slightly later might be an improvement over the > > current situation. > > > > > > This is very reassuring, thank you! > > > > Although, I don't think this will work immediately, see below, > > > > I tried multiple approaches to manually inserting an llvm.dbg.addr > > after the store instruction, as per your suggestion, Jeremy. I used > > llc to compile the IR into an object file that I then linked, and > > inspected the DWARF generated for the file. Unfortunately, inserting > > dbg.addr that operated on the reloaded values didn't lead to any > > change in the DWARF that was produced -- specifically, this didn't > > make a difference: > > > > call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata > > !873, metadata !DIExpression()), !dbg !884 > > > > > > Ouch, I tried this myself, and ran into the same difficulty. I'd > > missed that all your functions are marked "optnone" / -O0, which means > > a different instruction-selection pass (FastISel) runs, and it turns > > out FastISel isn't aware of dbg.addrs existence. Even better, FastISel > > doesn't manage to lower any debug intrinsic (including dbg.declare) > > that refers to a GEP, because it doesn't have a register location (the > > GEP gets folded into a memory addressing mode). > > > > I've hacked together some support in [0], that allows dbg.addr's of > > GEPs to be handled. A single dbg.addr at the start of the function > > (and no dbg.values) should get you the same behaviour as a > > dbg.declare. > > > > > > Maybe I'm running into the limitations of how far I can get here by > > hacking up the LLVM IR directly, but I've made a couple of tries here > > without any luck. I put the IR files and the DWARF dumps of > > executables built from that IR in another GitHub gist, > > https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275. > > Here's a summary of the attempts I've made -- all of them with your > > patch included in my llvm-project tree: > > > > 1. Don't lower llvm.dbg.declare to llvm.dbg.value at all, no other > > changes -- doesn't work, in the way I expected it to not work > > > > I tried removing the call to llvm::LowerDbgDeclare that's being made > > by the coroutine passes. The IR and DWARF dump after having done so > > are in the files 'repro.declare.ll' and 'repro.declare.dwarf.txt'. > > > > Using an executable built from this IR, I can use lldb to break at the > > line where the declare exists, > > https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-ll-L167, > > using the command 'b repro.cpp:24'. Running 'e i' at this point prints > > an incorrect value, '(int) $0 = 24742', but at least 'i' is found. > > > > Then, continuing past a coroutine "suspend point" (a point after which > > the value of 'i' is stored into an offset of the coroutine frame > > object, an outlined coroutine function 'foo.resume' is invoked, 'i' is > > loaded out of an offset on the coroutine frame object, and normally a > > llvm.dbg.value call would have been generated for that load), and then > > breaking at 'b repro.cpp:32', executing 'e i' results in 'error: <user > > expression 1>:1:1: use of undeclared identifier 'i''. This fits my > > mental model, I think: 'i' was declared in the function 'foo', and its > > DWARF information > > https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-dwarf-txt-L3711 > > only specifies its offset from the 'foo' frame. After the suspend > > point, lldb is stopped in 'foo.resume', and the 'foo' frame is no > > longer active, so lldb determines 'i' is out of scope. Makes sense! > > > > 2. Don't lower llvm.dbg.declare at all, sed to automatically replace > > all llvm.dbg.declare with llvm.dbg.addr -- doesn't work, in an > > unexpected way > > > > The outputs from this are in the files repro.addr.ll and > > repro.addr.dwarf.txt. My understanding based on > > https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-addr is that > > llvm.dbg.addr should work at least as well as llvm.dbg.declare does. > > However, when I use it and break at > > https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-addr-ll-L167 > > ('b repro.cpp:24'), executing 'e i' in lldb outputs 'error: <user > > expression 1>:1:1: use of undeclared identifier 'i''. This threw me > > for a loop -- maybe I'm misunderstanding the relationship between > > llvm.dbg.addr and llvm.dbg.declare? > > > > - Brian Gesiak > > > > I suspect the reason why this problem hasn't shown up in > > the past is because the coroutine code being generated hits a gap > > between "optimised" and "not optimised": I believe all variables in > > code that isn't optimised get their own storage (and so will always > > have a stack or register location). Wheras in the coroutine code > > you're generating the variable address doesn't get storage. > > > > If [0] is useful for you I can get that landed; it'd be good to hear > > whether this resolves the dbg.addr intrinsics not having an affect on > > the output. > > > > [0] https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b > > > > -- > > Thanks, > > Jeremy > > > > _______________________________________________ > > LLVM Developers mailing list > > llvm-dev at lists.llvm.org > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > >
Vedant Kumar via llvm-dev
2020-Feb-27 00:20 UTC
[llvm-dev] [lldb-dev] Why is lldb telling me "variable not available"?
> On Feb 26, 2020, at 3:29 PM, Brian Gesiak via lldb-dev <lldb-dev at lists.llvm.org> wrote: > > Vedant, Jeremy, > > Thanks a ton! I copied ASan's use of 'replaceDbgDeclare', think that worked! > > https://github.com/modocache/llvm-project/commit/afbc04e1dcba has some > extremely quick and dirty changes I made (with no tests!), and a link > to a Gist with the LLVM IR and DWARF produced, > https://gist.github.com/modocache/6f29093ba2827946011b422ed3bd2903. > > There's only one kink: the spot I placed it has 'replaceDbgDeclare' > called multiple times, and it reuses the result of a single > getelementptr instruction. Because of this, only the last invocation > results in correct debug info being produced. Put another way: when I > place a breakpoint in my example on the last instance that %i is > reloaded, repro.cpp:45 at > https://gist.github.com/modocache/6f29093ba2827946011b422ed3bd2903#file-repro-cpp-L45, > I get the correct debug info: i = 3, j = 2. In every other spot, I get > 'error: Couldn't materialize: couldn't get the value of variable i: > variable not available'. Still, is great progress for me. > > I think to work around this error, I need to do 2 things: > > 1. Call replaceDbgDeclare only once. > 2. Call it with the correct arguments -- you may have noticed in > https://github.com/modocache/llvm-project/commit/afbc04e1dcba#diff-f1204f42e8cb9e3718892e63059f8347R743 > that I'm applying an offset of 0, which doesn't make much sense. I > need to figure out how to use this function such that it encodes in > DWARF to load at an offset from the coroutine frame, just as a > getelementptr would.Haven't looked at this closely, but (2) sounds right, the goal is to let the compiler know that variables which used to live in one memory location have moved to another. If you have the precise address of the variable's new location within the 'coro' frame, you don't need to specify the offset parameter to replaceDbgDeclare ("NewAddress" is exact). More typically, instrumentation passes maintain a pointer to the new frame and a table of offsets for various variables, so the common pattern is 'replaceDbgDeclare(oldAllocaOrGEPForVar, newFramePtr, ..., offsetIntoNewFramePtrForVar)'. vedant> > - Brian > > On Wed, Feb 26, 2020 at 5:15 PM Brian Gesiak <modocache at gmail.com> wrote: >> >> Vedant, thank you! I had meant to ask if any of this reminded you all >> of something else that I could emulate. I'll look into uses of >> 'replaceDbgDeclare' in SafeStack/ASan. - Brian >> >> On Wed, Feb 26, 2020 at 5:08 PM Vedant Kumar <vedant_kumar at apple.com> wrote: >>> >>> I haven't fully parsed this thread (sorry!), but I wanted to briefly mention that the SafeStack & ASan passes both do something similar (I think): move local variables backed by allocas onto a separate stack. These passes use replaceDbgDeclare to rewrite dbg.declares s.t. they point into the new stack. After that, llvm presumably runs LowerDbgDeclare (usually via InstCombine), but all the inserted dbg.values have useful locations relative to the new stack. >>> >>> It could be worth investigating whether replaceDbgDeclare is a good fit for the coro-split pass. >>> >>> vedant >>> >>> On Feb 26, 2020, at 1:32 PM, Brian Gesiak via llvm-dev <llvm-dev at lists.llvm.org> wrote: >>> >>> Awesome, thanks Jeremy. >>> >>> On Wed, Feb 26, 2020 at 11:02 AM Jeremy Morse >>> <jeremy.morse.llvm at gmail.com> wrote: >>> >>> >>> Hi Brian, >>> >>> On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at gmail.com> wrote: >>> >>> In other words, the value of %i is stored on the frame object, on the >>> heap, at an offset of 7 into the frame. I'm beginning to think a >>> fundamental fix for this issue would be to stop replacing >>> llvm.dbg.declare with llvm.dbg.value, and instead replace the >>> llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i >>> variable's new permanent location as an offset into the coroutine >>> frame object. Does this approach make sense to people on this mailing >>> list, who probably know more about how these intrinsics work than I >>> do? >>> >>> >>> This matches a few similar use cases that I'm aware of -- certain >>> kinds of struct that are passed-by-value according to the language, >>> but passed-by-reference according to ABI, are treated in that way. In >>> general, the downside is that the debugger can only observe variable >>> values when they get written to memory, not when they're computed, as >>> dbg.values and dbg.declares aren't supposed to be mixed. Observing >>> variable values slightly later might be an improvement over the >>> current situation. >>> >>> >>> This is very reassuring, thank you! >>> >>> Although, I don't think this will work immediately, see below, >>> >>> I tried multiple approaches to manually inserting an llvm.dbg.addr >>> after the store instruction, as per your suggestion, Jeremy. I used >>> llc to compile the IR into an object file that I then linked, and >>> inspected the DWARF generated for the file. Unfortunately, inserting >>> dbg.addr that operated on the reloaded values didn't lead to any >>> change in the DWARF that was produced -- specifically, this didn't >>> make a difference: >>> >>> call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata >>> !873, metadata !DIExpression()), !dbg !884 >>> >>> >>> Ouch, I tried this myself, and ran into the same difficulty. I'd >>> missed that all your functions are marked "optnone" / -O0, which means >>> a different instruction-selection pass (FastISel) runs, and it turns >>> out FastISel isn't aware of dbg.addrs existence. Even better, FastISel >>> doesn't manage to lower any debug intrinsic (including dbg.declare) >>> that refers to a GEP, because it doesn't have a register location (the >>> GEP gets folded into a memory addressing mode). >>> >>> I've hacked together some support in [0], that allows dbg.addr's of >>> GEPs to be handled. A single dbg.addr at the start of the function >>> (and no dbg.values) should get you the same behaviour as a >>> dbg.declare. >>> >>> >>> Maybe I'm running into the limitations of how far I can get here by >>> hacking up the LLVM IR directly, but I've made a couple of tries here >>> without any luck. I put the IR files and the DWARF dumps of >>> executables built from that IR in another GitHub gist, >>> https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275. >>> Here's a summary of the attempts I've made -- all of them with your >>> patch included in my llvm-project tree: >>> >>> 1. Don't lower llvm.dbg.declare to llvm.dbg.value at all, no other >>> changes -- doesn't work, in the way I expected it to not work >>> >>> I tried removing the call to llvm::LowerDbgDeclare that's being made >>> by the coroutine passes. The IR and DWARF dump after having done so >>> are in the files 'repro.declare.ll' and 'repro.declare.dwarf.txt'. >>> >>> Using an executable built from this IR, I can use lldb to break at the >>> line where the declare exists, >>> https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-ll-L167, >>> using the command 'b repro.cpp:24'. Running 'e i' at this point prints >>> an incorrect value, '(int) $0 = 24742', but at least 'i' is found. >>> >>> Then, continuing past a coroutine "suspend point" (a point after which >>> the value of 'i' is stored into an offset of the coroutine frame >>> object, an outlined coroutine function 'foo.resume' is invoked, 'i' is >>> loaded out of an offset on the coroutine frame object, and normally a >>> llvm.dbg.value call would have been generated for that load), and then >>> breaking at 'b repro.cpp:32', executing 'e i' results in 'error: <user >>> expression 1>:1:1: use of undeclared identifier 'i''. This fits my >>> mental model, I think: 'i' was declared in the function 'foo', and its >>> DWARF information >>> https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-dwarf-txt-L3711 >>> only specifies its offset from the 'foo' frame. After the suspend >>> point, lldb is stopped in 'foo.resume', and the 'foo' frame is no >>> longer active, so lldb determines 'i' is out of scope. Makes sense! >>> >>> 2. Don't lower llvm.dbg.declare at all, sed to automatically replace >>> all llvm.dbg.declare with llvm.dbg.addr -- doesn't work, in an >>> unexpected way >>> >>> The outputs from this are in the files repro.addr.ll and >>> repro.addr.dwarf.txt. My understanding based on >>> https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-addr is that >>> llvm.dbg.addr should work at least as well as llvm.dbg.declare does. >>> However, when I use it and break at >>> https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-addr-ll-L167 >>> ('b repro.cpp:24'), executing 'e i' in lldb outputs 'error: <user >>> expression 1>:1:1: use of undeclared identifier 'i''. This threw me >>> for a loop -- maybe I'm misunderstanding the relationship between >>> llvm.dbg.addr and llvm.dbg.declare? >>> >>> - Brian Gesiak >>> >>> I suspect the reason why this problem hasn't shown up in >>> the past is because the coroutine code being generated hits a gap >>> between "optimised" and "not optimised": I believe all variables in >>> code that isn't optimised get their own storage (and so will always >>> have a stack or register location). Wheras in the coroutine code >>> you're generating the variable address doesn't get storage. >>> >>> If [0] is useful for you I can get that landed; it'd be good to hear >>> whether this resolves the dbg.addr intrinsics not having an affect on >>> the output. >>> >>> [0] https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b >>> >>> -- >>> Thanks, >>> Jeremy >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org >>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >>> >>> > _______________________________________________ > lldb-dev mailing list > lldb-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
Brian Gesiak via llvm-dev
2020-Feb-28 12:08 UTC
[llvm-dev] [lldb-dev] Why is lldb telling me "variable not available"?
Jeremy, Vedant, thank you both for your help! I think I have something working pretty well: https://reviews.llvm.org/D75338 I'd greatly appreciate code review here -- in the course of writing the patch I definitely encountered gaps in my knowledge about debug info. In particular, I wasn't able to use the '-mllvm -use-dbg-addr' option successfully, and I don't quite understand why. Having read the docs in https://llvm.org/docs/SourceLevelDebugging.html, I'd expect the patch to work just as well whether using 'dbg.declare' or 'dbg.addr', but in practice when using 'dbg.addr' I couldn't find any variable entries in the DWARF that was output. Maybe I'm not using that option correctly? I hope to spend some more time soon learning more about debug info outside of this particular patch, so thanks again for all your help. - Brian Gesiak