Jeremy Morse via llvm-dev
2020-Feb-26  16:01 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Hi Brian, On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at gmail.com> wrote:> In other words, the value of %i is stored on the frame object, on the > heap, at an offset of 7 into the frame. I'm beginning to think a > fundamental fix for this issue would be to stop replacing > llvm.dbg.declare with llvm.dbg.value, and instead replace the > llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i > variable's new permanent location as an offset into the coroutine > frame object. Does this approach make sense to people on this mailing > list, who probably know more about how these intrinsics work than I > do?This matches a few similar use cases that I'm aware of -- certain kinds of struct that are passed-by-value according to the language, but passed-by-reference according to ABI, are treated in that way. In general, the downside is that the debugger can only observe variable values when they get written to memory, not when they're computed, as dbg.values and dbg.declares aren't supposed to be mixed. Observing variable values slightly later might be an improvement over the current situation. Although, I don't think this will work immediately, see below,> I tried multiple approaches to manually inserting an llvm.dbg.addr > after the store instruction, as per your suggestion, Jeremy. I used > llc to compile the IR into an object file that I then linked, and > inspected the DWARF generated for the file. Unfortunately, inserting > dbg.addr that operated on the reloaded values didn't lead to any > change in the DWARF that was produced -- specifically, this didn't > make a difference: > > call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata > !873, metadata !DIExpression()), !dbg !884Ouch, I tried this myself, and ran into the same difficulty. I'd missed that all your functions are marked "optnone" / -O0, which means a different instruction-selection pass (FastISel) runs, and it turns out FastISel isn't aware of dbg.addrs existence. Even better, FastISel doesn't manage to lower any debug intrinsic (including dbg.declare) that refers to a GEP, because it doesn't have a register location (the GEP gets folded into a memory addressing mode). I've hacked together some support in [0], that allows dbg.addr's of GEPs to be handled. A single dbg.addr at the start of the function (and no dbg.values) should get you the same behaviour as a dbg.declare. I suspect the reason why this problem hasn't shown up in the past is because the coroutine code being generated hits a gap between "optimised" and "not optimised": I believe all variables in code that isn't optimised get their own storage (and so will always have a stack or register location). Wheras in the coroutine code you're generating the variable address doesn't get storage. If [0] is useful for you I can get that landed; it'd be good to hear whether this resolves the dbg.addr intrinsics not having an affect on the output. [0] https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b -- Thanks, Jeremy
Brian Gesiak via llvm-dev
2020-Feb-26  21:32 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
Awesome, thanks Jeremy. On Wed, Feb 26, 2020 at 11:02 AM Jeremy Morse <jeremy.morse.llvm at gmail.com> wrote:> > Hi Brian, > > On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at gmail.com> wrote: > > In other words, the value of %i is stored on the frame object, on the > > heap, at an offset of 7 into the frame. I'm beginning to think a > > fundamental fix for this issue would be to stop replacing > > llvm.dbg.declare with llvm.dbg.value, and instead replace the > > llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i > > variable's new permanent location as an offset into the coroutine > > frame object. Does this approach make sense to people on this mailing > > list, who probably know more about how these intrinsics work than I > > do? > > This matches a few similar use cases that I'm aware of -- certain > kinds of struct that are passed-by-value according to the language, > but passed-by-reference according to ABI, are treated in that way. In > general, the downside is that the debugger can only observe variable > values when they get written to memory, not when they're computed, as > dbg.values and dbg.declares aren't supposed to be mixed. Observing > variable values slightly later might be an improvement over the > current situation.This is very reassuring, thank you!> Although, I don't think this will work immediately, see below, > > > I tried multiple approaches to manually inserting an llvm.dbg.addr > > after the store instruction, as per your suggestion, Jeremy. I used > > llc to compile the IR into an object file that I then linked, and > > inspected the DWARF generated for the file. Unfortunately, inserting > > dbg.addr that operated on the reloaded values didn't lead to any > > change in the DWARF that was produced -- specifically, this didn't > > make a difference: > > > > call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata > > !873, metadata !DIExpression()), !dbg !884 > > Ouch, I tried this myself, and ran into the same difficulty. I'd > missed that all your functions are marked "optnone" / -O0, which means > a different instruction-selection pass (FastISel) runs, and it turns > out FastISel isn't aware of dbg.addrs existence. Even better, FastISel > doesn't manage to lower any debug intrinsic (including dbg.declare) > that refers to a GEP, because it doesn't have a register location (the > GEP gets folded into a memory addressing mode). > > I've hacked together some support in [0], that allows dbg.addr's of > GEPs to be handled. A single dbg.addr at the start of the function > (and no dbg.values) should get you the same behaviour as a > dbg.declare.Maybe I'm running into the limitations of how far I can get here by hacking up the LLVM IR directly, but I've made a couple of tries here without any luck. I put the IR files and the DWARF dumps of executables built from that IR in another GitHub gist, https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275. Here's a summary of the attempts I've made -- all of them with your patch included in my llvm-project tree: 1. Don't lower llvm.dbg.declare to llvm.dbg.value at all, no other changes -- doesn't work, in the way I expected it to not work I tried removing the call to llvm::LowerDbgDeclare that's being made by the coroutine passes. The IR and DWARF dump after having done so are in the files 'repro.declare.ll' and 'repro.declare.dwarf.txt'. Using an executable built from this IR, I can use lldb to break at the line where the declare exists, https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-ll-L167, using the command 'b repro.cpp:24'. Running 'e i' at this point prints an incorrect value, '(int) $0 = 24742', but at least 'i' is found. Then, continuing past a coroutine "suspend point" (a point after which the value of 'i' is stored into an offset of the coroutine frame object, an outlined coroutine function 'foo.resume' is invoked, 'i' is loaded out of an offset on the coroutine frame object, and normally a llvm.dbg.value call would have been generated for that load), and then breaking at 'b repro.cpp:32', executing 'e i' results in 'error: <user expression 1>:1:1: use of undeclared identifier 'i''. This fits my mental model, I think: 'i' was declared in the function 'foo', and its DWARF information https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-dwarf-txt-L3711 only specifies its offset from the 'foo' frame. After the suspend point, lldb is stopped in 'foo.resume', and the 'foo' frame is no longer active, so lldb determines 'i' is out of scope. Makes sense! 2. Don't lower llvm.dbg.declare at all, sed to automatically replace all llvm.dbg.declare with llvm.dbg.addr -- doesn't work, in an unexpected way The outputs from this are in the files repro.addr.ll and repro.addr.dwarf.txt. My understanding based on https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-addr is that llvm.dbg.addr should work at least as well as llvm.dbg.declare does. However, when I use it and break at https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-addr-ll-L167 ('b repro.cpp:24'), executing 'e i' in lldb outputs 'error: <user expression 1>:1:1: use of undeclared identifier 'i''. This threw me for a loop -- maybe I'm misunderstanding the relationship between llvm.dbg.addr and llvm.dbg.declare? - Brian Gesiak> I suspect the reason why this problem hasn't shown up in > the past is because the coroutine code being generated hits a gap > between "optimised" and "not optimised": I believe all variables in > code that isn't optimised get their own storage (and so will always > have a stack or register location). Wheras in the coroutine code > you're generating the variable address doesn't get storage. > > If [0] is useful for you I can get that landed; it'd be good to hear > whether this resolves the dbg.addr intrinsics not having an affect on > the output. > > [0] https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b > > -- > Thanks, > Jeremy
Vedant Kumar via llvm-dev
2020-Feb-26  22:08 UTC
[llvm-dev] Why is lldb telling me "variable not available"?
I haven't fully parsed this thread (sorry!), but I wanted to briefly mention that the SafeStack & ASan passes both do something similar (I think): move local variables backed by allocas onto a separate stack. These passes use replaceDbgDeclare to rewrite dbg.declares s.t. they point into the new stack. After that, llvm presumably runs LowerDbgDeclare (usually via InstCombine), but all the inserted dbg.values have useful locations relative to the new stack. It could be worth investigating whether replaceDbgDeclare is a good fit for the coro-split pass. vedant> On Feb 26, 2020, at 1:32 PM, Brian Gesiak via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Awesome, thanks Jeremy. > > On Wed, Feb 26, 2020 at 11:02 AM Jeremy Morse > <jeremy.morse.llvm at gmail.com <mailto:jeremy.morse.llvm at gmail.com>> wrote: >> >> Hi Brian, >> >> On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at gmail.com> wrote: >>> In other words, the value of %i is stored on the frame object, on the >>> heap, at an offset of 7 into the frame. I'm beginning to think a >>> fundamental fix for this issue would be to stop replacing >>> llvm.dbg.declare with llvm.dbg.value, and instead replace the >>> llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i >>> variable's new permanent location as an offset into the coroutine >>> frame object. Does this approach make sense to people on this mailing >>> list, who probably know more about how these intrinsics work than I >>> do? >> >> This matches a few similar use cases that I'm aware of -- certain >> kinds of struct that are passed-by-value according to the language, >> but passed-by-reference according to ABI, are treated in that way. In >> general, the downside is that the debugger can only observe variable >> values when they get written to memory, not when they're computed, as >> dbg.values and dbg.declares aren't supposed to be mixed. Observing >> variable values slightly later might be an improvement over the >> current situation. > > This is very reassuring, thank you! > >> Although, I don't think this will work immediately, see below, >> >>> I tried multiple approaches to manually inserting an llvm.dbg.addr >>> after the store instruction, as per your suggestion, Jeremy. I used >>> llc to compile the IR into an object file that I then linked, and >>> inspected the DWARF generated for the file. Unfortunately, inserting >>> dbg.addr that operated on the reloaded values didn't lead to any >>> change in the DWARF that was produced -- specifically, this didn't >>> make a difference: >>> >>> call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata >>> !873, metadata !DIExpression()), !dbg !884 >> >> Ouch, I tried this myself, and ran into the same difficulty. I'd >> missed that all your functions are marked "optnone" / -O0, which means >> a different instruction-selection pass (FastISel) runs, and it turns >> out FastISel isn't aware of dbg.addrs existence. Even better, FastISel >> doesn't manage to lower any debug intrinsic (including dbg.declare) >> that refers to a GEP, because it doesn't have a register location (the >> GEP gets folded into a memory addressing mode). >> >> I've hacked together some support in [0], that allows dbg.addr's of >> GEPs to be handled. A single dbg.addr at the start of the function >> (and no dbg.values) should get you the same behaviour as a >> dbg.declare. > > Maybe I'm running into the limitations of how far I can get here by > hacking up the LLVM IR directly, but I've made a couple of tries here > without any luck. I put the IR files and the DWARF dumps of > executables built from that IR in another GitHub gist, > https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275 <https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275>. > Here's a summary of the attempts I've made -- all of them with your > patch included in my llvm-project tree: > > 1. Don't lower llvm.dbg.declare to llvm.dbg.value at all, no other > changes -- doesn't work, in the way I expected it to not work > > I tried removing the call to llvm::LowerDbgDeclare that's being made > by the coroutine passes. The IR and DWARF dump after having done so > are in the files 'repro.declare.ll' and 'repro.declare.dwarf.txt'. > > Using an executable built from this IR, I can use lldb to break at the > line where the declare exists, > https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-ll-L167 <https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-ll-L167>, > using the command 'b repro.cpp:24'. Running 'e i' at this point prints > an incorrect value, '(int) $0 = 24742', but at least 'i' is found. > > Then, continuing past a coroutine "suspend point" (a point after which > the value of 'i' is stored into an offset of the coroutine frame > object, an outlined coroutine function 'foo.resume' is invoked, 'i' is > loaded out of an offset on the coroutine frame object, and normally a > llvm.dbg.value call would have been generated for that load), and then > breaking at 'b repro.cpp:32', executing 'e i' results in 'error: <user > expression 1>:1:1: use of undeclared identifier 'i''. This fits my > mental model, I think: 'i' was declared in the function 'foo', and its > DWARF information > https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-dwarf-txt-L3711 <https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-dwarf-txt-L3711> > only specifies its offset from the 'foo' frame. After the suspend > point, lldb is stopped in 'foo.resume', and the 'foo' frame is no > longer active, so lldb determines 'i' is out of scope. Makes sense! > > 2. Don't lower llvm.dbg.declare at all, sed to automatically replace > all llvm.dbg.declare with llvm.dbg.addr -- doesn't work, in an > unexpected way > > The outputs from this are in the files repro.addr.ll and > repro.addr.dwarf.txt. My understanding based on > https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-addr <https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-addr> is that > llvm.dbg.addr should work at least as well as llvm.dbg.declare does. > However, when I use it and break at > https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-addr-ll-L167 <https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-addr-ll-L167> > ('b repro.cpp:24'), executing 'e i' in lldb outputs 'error: <user > expression 1>:1:1: use of undeclared identifier 'i''. This threw me > for a loop -- maybe I'm misunderstanding the relationship between > llvm.dbg.addr and llvm.dbg.declare? > > - Brian Gesiak > >> I suspect the reason why this problem hasn't shown up in >> the past is because the coroutine code being generated hits a gap >> between "optimised" and "not optimised": I believe all variables in >> code that isn't optimised get their own storage (and so will always >> have a stack or register location). Wheras in the coroutine code >> you're generating the variable address doesn't get storage. >> >> If [0] is useful for you I can get that landed; it'd be good to hear >> whether this resolves the dbg.addr intrinsics not having an affect on >> the output. >> >> [0] https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b >> >> -- >> Thanks, >> Jeremy > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200226/7a94d202/attachment.html>