thr3ads.net - llvm dev - [llvm-dev] Why is lldb telling me "variable not available"? [Feb 2020]

If this information is useful, please help other people find it:
Share via:

Jeremy Morse via llvm-dev

2020-Feb-26 16:01 UTC

[llvm-dev] Why is lldb telling me "variable not available"?

Hi Brian,

On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at gmail.com>
wrote:> In other words, the value of %i is stored on the frame object, on the
> heap, at an offset of 7 into the frame. I'm beginning to think a
> fundamental fix for this issue would be to stop replacing
> llvm.dbg.declare with llvm.dbg.value, and instead replace the
> llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i
> variable's new permanent location as an offset into the coroutine
> frame object. Does this approach make sense to people on this mailing
> list, who probably know more about how these intrinsics work than I
> do?
This matches a few similar use cases that I'm aware of -- certain
kinds of struct that are passed-by-value according to the language,
but passed-by-reference according to ABI, are treated in that way. In
general, the downside is that the debugger can only observe variable
values when they get written to memory, not when they're computed, as
dbg.values and dbg.declares aren't supposed to be mixed. Observing
variable values slightly later might be an improvement over the
current situation.

Although, I don't think this will work immediately, see below,
> I tried multiple approaches to manually inserting an llvm.dbg.addr
> after the store instruction, as per your suggestion, Jeremy. I used
> llc to compile the IR into an object file that I then linked, and
> inspected the DWARF generated for the file. Unfortunately, inserting
> dbg.addr that operated on the reloaded values didn't lead to any
> change in the DWARF that was produced --  specifically, this didn't
> make a difference:
>
> call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata
> !873, metadata !DIExpression()), !dbg !884
Ouch, I tried this myself, and ran into the same difficulty. I'd
missed that all your functions are marked "optnone" / -O0, which means
a different instruction-selection pass (FastISel) runs, and it turns
out FastISel isn't aware of dbg.addrs existence. Even better, FastISel
doesn't manage to lower any debug intrinsic (including dbg.declare)
that refers to a GEP, because it doesn't have a register location (the
GEP gets folded into a memory addressing mode).

I've hacked together some support in [0], that allows dbg.addr's of
GEPs to be handled. A single dbg.addr at the start of the function
(and no dbg.values) should get you the same behaviour as a
dbg.declare. I suspect the reason why this problem hasn't shown up in
the past is because the coroutine code being generated hits a gap
between "optimised" and "not optimised": I believe all
variables in
code that isn't optimised get their own storage (and so will always
have a stack or register location). Wheras in the coroutine code
you're generating the variable address doesn't get storage.

If [0] is useful for you I can get that landed; it'd be good to hear
whether this resolves the dbg.addr intrinsics not having an affect on
the output.

[0]
https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b

--
Thanks,
Jeremy

Brian Gesiak via llvm-dev

2020-Feb-26 21:32 UTC

head link

[llvm-dev] Why is lldb telling me "variable not available"?

Awesome, thanks Jeremy.

On Wed, Feb 26, 2020 at 11:02 AM Jeremy Morse
<jeremy.morse.llvm at gmail.com> wrote:>
> Hi Brian,
>
> On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at gmail.com>
wrote:
> > In other words, the value of %i is stored on the frame object, on the
> > heap, at an offset of 7 into the frame. I'm beginning to think a
> > fundamental fix for this issue would be to stop replacing
> > llvm.dbg.declare with llvm.dbg.value, and instead replace the
> > llvm.dbg.declare with llvm.dbg.addr that points the debugger to the %i
> > variable's new permanent location as an offset into the coroutine
> > frame object. Does this approach make sense to people on this mailing
> > list, who probably know more about how these intrinsics work than I
> > do?
>
> This matches a few similar use cases that I'm aware of -- certain
> kinds of struct that are passed-by-value according to the language,
> but passed-by-reference according to ABI, are treated in that way. In
> general, the downside is that the debugger can only observe variable
> values when they get written to memory, not when they're computed, as
> dbg.values and dbg.declares aren't supposed to be mixed. Observing
> variable values slightly later might be an improvement over the
> current situation.
This is very reassuring, thank you!
> Although, I don't think this will work immediately, see below,
>
> > I tried multiple approaches to manually inserting an llvm.dbg.addr
> > after the store instruction, as per your suggestion, Jeremy. I used
> > llc to compile the IR into an object file that I then linked, and
> > inspected the DWARF generated for the file. Unfortunately, inserting
> > dbg.addr that operated on the reloaded values didn't lead to any
> > change in the DWARF that was produced --  specifically, this
didn't
> > make a difference:
> >
> > call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata
> > !873, metadata !DIExpression()), !dbg !884
>
> Ouch, I tried this myself, and ran into the same difficulty. I'd
> missed that all your functions are marked "optnone" / -O0, which
means
> a different instruction-selection pass (FastISel) runs, and it turns
> out FastISel isn't aware of dbg.addrs existence. Even better, FastISel
> doesn't manage to lower any debug intrinsic (including dbg.declare)
> that refers to a GEP, because it doesn't have a register location (the
> GEP gets folded into a memory addressing mode).
>
> I've hacked together some support in [0], that allows dbg.addr's of
> GEPs to be handled. A single dbg.addr at the start of the function
> (and no dbg.values) should get you the same behaviour as a
> dbg.declare.
Maybe I'm running into the limitations of how far I can get here by
hacking up the LLVM IR directly, but I've made a couple of tries here
without any luck. I put the IR files and the DWARF dumps of
executables built from that IR in another GitHub gist,
https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275.
Here's a summary of the attempts I've made -- all of them with your
patch included in my llvm-project tree:

1. Don't lower llvm.dbg.declare to llvm.dbg.value at all, no other
changes -- doesn't work, in the way I expected it to not work

I tried removing the call to llvm::LowerDbgDeclare that's being made
by the coroutine passes. The IR and DWARF dump after having done so
are in the files 'repro.declare.ll' and
'repro.declare.dwarf.txt'.

Using an executable built from this IR, I can use lldb to break at the
line where the declare exists,
https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-ll-L167,
using the command 'b repro.cpp:24'. Running 'e i' at this point
prints
an incorrect value, '(int) $0 = 24742', but at least 'i' is
found.

Then, continuing past a coroutine "suspend point" (a point after which
the value of 'i' is stored into an offset of the coroutine frame
object, an outlined coroutine function 'foo.resume' is invoked,
'i' is
loaded out of an offset on the coroutine frame object, and normally a
llvm.dbg.value call would have been generated for that load), and then
breaking at 'b repro.cpp:32', executing 'e i' results in
'error: <user
expression 1>:1:1: use of undeclared identifier 'i''. This fits
my
mental model, I think: 'i' was declared in the function 'foo',
and its
DWARF information
https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-dwarf-txt-L3711
only specifies its offset from the 'foo' frame. After the suspend
point, lldb is stopped in 'foo.resume', and the 'foo' frame is
no
longer active, so lldb determines 'i' is out of scope. Makes sense!

2. Don't lower llvm.dbg.declare at all, sed to automatically replace
all llvm.dbg.declare with llvm.dbg.addr -- doesn't work, in an
unexpected way

The outputs from this are in the files repro.addr.ll and
repro.addr.dwarf.txt. My understanding based on
https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-addr is that
llvm.dbg.addr should work at least as well as llvm.dbg.declare does.
However, when I use it and break at
https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-addr-ll-L167
('b repro.cpp:24'), executing 'e i' in lldb outputs 'error:
<user
expression 1>:1:1: use of undeclared identifier 'i''. This threw
me
for a loop -- maybe I'm misunderstanding the relationship between
llvm.dbg.addr and llvm.dbg.declare?

- Brian Gesiak
> I suspect the reason why this problem hasn't shown up in
> the past is because the coroutine code being generated hits a gap
> between "optimised" and "not optimised": I believe all
variables in
> code that isn't optimised get their own storage (and so will always
> have a stack or register location). Wheras in the coroutine code
> you're generating the variable address doesn't get storage.
>
> If [0] is useful for you I can get that landed; it'd be good to hear
> whether this resolves the dbg.addr intrinsics not having an affect on
> the output.
>
> [0]
https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b
>
> --
> Thanks,
> Jeremy

Vedant Kumar via llvm-dev

2020-Feb-26 22:08 UTC

head link

[llvm-dev] Why is lldb telling me "variable not available"?

I haven't fully parsed this thread (sorry!), but I wanted to briefly mention
that the SafeStack & ASan passes both do something similar (I think): move
local variables backed by allocas onto a separate stack. These passes use
replaceDbgDeclare to rewrite dbg.declares s.t. they point into the new stack.
After that, llvm presumably runs LowerDbgDeclare (usually via InstCombine), but
all the inserted dbg.values have useful locations relative to the new stack.

It could be worth investigating whether replaceDbgDeclare is a good fit for the
coro-split pass.

vedant
> On Feb 26, 2020, at 1:32 PM, Brian Gesiak via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Awesome, thanks Jeremy.
> 
> On Wed, Feb 26, 2020 at 11:02 AM Jeremy Morse
> <jeremy.morse.llvm at gmail.com <mailto:jeremy.morse.llvm at
gmail.com>> wrote:
>> 
>> Hi Brian,
>> 
>> On Tue, Feb 25, 2020 at 7:43 PM Brian Gesiak <modocache at
gmail.com> wrote:
>>> In other words, the value of %i is stored on the frame object, on
the
>>> heap, at an offset of 7 into the frame. I'm beginning to think
a
>>> fundamental fix for this issue would be to stop replacing
>>> llvm.dbg.declare with llvm.dbg.value, and instead replace the
>>> llvm.dbg.declare with llvm.dbg.addr that points the debugger to the
%i
>>> variable's new permanent location as an offset into the
coroutine
>>> frame object. Does this approach make sense to people on this
mailing
>>> list, who probably know more about how these intrinsics work than I
>>> do?
>> 
>> This matches a few similar use cases that I'm aware of -- certain
>> kinds of struct that are passed-by-value according to the language,
>> but passed-by-reference according to ABI, are treated in that way. In
>> general, the downside is that the debugger can only observe variable
>> values when they get written to memory, not when they're computed,
as
>> dbg.values and dbg.declares aren't supposed to be mixed. Observing
>> variable values slightly later might be an improvement over the
>> current situation.
> 
> This is very reassuring, thank you!
> 
>> Although, I don't think this will work immediately, see below,
>> 
>>> I tried multiple approaches to manually inserting an llvm.dbg.addr
>>> after the store instruction, as per your suggestion, Jeremy. I used
>>> llc to compile the IR into an object file that I then linked, and
>>> inspected the DWARF generated for the file. Unfortunately,
inserting
>>> dbg.addr that operated on the reloaded values didn't lead to
any
>>> change in the DWARF that was produced --  specifically, this
didn't
>>> make a difference:
>>> 
>>> call void @llvm.dbg.addr(metadata i32* %i.reload.addr62, metadata
>>> !873, metadata !DIExpression()), !dbg !884
>> 
>> Ouch, I tried this myself, and ran into the same difficulty. I'd
>> missed that all your functions are marked "optnone" / -O0,
which means
>> a different instruction-selection pass (FastISel) runs, and it turns
>> out FastISel isn't aware of dbg.addrs existence. Even better,
FastISel
>> doesn't manage to lower any debug intrinsic (including dbg.declare)
>> that refers to a GEP, because it doesn't have a register location
(the
>> GEP gets folded into a memory addressing mode).
>> 
>> I've hacked together some support in [0], that allows
dbg.addr's of
>> GEPs to be handled. A single dbg.addr at the start of the function
>> (and no dbg.values) should get you the same behaviour as a
>> dbg.declare.
> 
> Maybe I'm running into the limitations of how far I can get here by
> hacking up the LLVM IR directly, but I've made a couple of tries here
> without any luck. I put the IR files and the DWARF dumps of
> executables built from that IR in another GitHub gist,
> https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275
<https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275>.
> Here's a summary of the attempts I've made -- all of them with your
> patch included in my llvm-project tree:
> 
> 1. Don't lower llvm.dbg.declare to llvm.dbg.value at all, no other
> changes -- doesn't work, in the way I expected it to not work
> 
> I tried removing the call to llvm::LowerDbgDeclare that's being made
> by the coroutine passes. The IR and DWARF dump after having done so
> are in the files 'repro.declare.ll' and
'repro.declare.dwarf.txt'.
> 
> Using an executable built from this IR, I can use lldb to break at the
> line where the declare exists,
>
https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-ll-L167
<https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-ll-L167>,
> using the command 'b repro.cpp:24'. Running 'e i' at this
point prints
> an incorrect value, '(int) $0 = 24742', but at least 'i' is
found.
> 
> Then, continuing past a coroutine "suspend point" (a point after
which
> the value of 'i' is stored into an offset of the coroutine frame
> object, an outlined coroutine function 'foo.resume' is invoked,
'i' is
> loaded out of an offset on the coroutine frame object, and normally a
> llvm.dbg.value call would have been generated for that load), and then
> breaking at 'b repro.cpp:32', executing 'e i' results in
'error: <user
> expression 1>:1:1: use of undeclared identifier 'i''. This
fits my
> mental model, I think: 'i' was declared in the function
'foo', and its
> DWARF information
>
https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-dwarf-txt-L3711
<https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-declare-dwarf-txt-L3711>
> only specifies its offset from the 'foo' frame. After the suspend
> point, lldb is stopped in 'foo.resume', and the 'foo' frame
is no
> longer active, so lldb determines 'i' is out of scope. Makes sense!
> 
> 2. Don't lower llvm.dbg.declare at all, sed to automatically replace
> all llvm.dbg.declare with llvm.dbg.addr -- doesn't work, in an
> unexpected way
> 
> The outputs from this are in the files repro.addr.ll and
> repro.addr.dwarf.txt. My understanding based on
> https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-addr
<https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-addr> is that
> llvm.dbg.addr should work at least as well as llvm.dbg.declare does.
> However, when I use it and break at
>
https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-addr-ll-L167
<https://gist.github.com/modocache/8a7b12eb42012990ba534787c4a47275#file-repro-addr-ll-L167>
> ('b repro.cpp:24'), executing 'e i' in lldb outputs
'error: <user
> expression 1>:1:1: use of undeclared identifier 'i''. This
threw me
> for a loop -- maybe I'm misunderstanding the relationship between
> llvm.dbg.addr and llvm.dbg.declare?
> 
> - Brian Gesiak
> 
>> I suspect the reason why this problem hasn't shown up in
>> the past is because the coroutine code being generated hits a gap
>> between "optimised" and "not optimised": I believe
all variables in
>> code that isn't optimised get their own storage (and so will always
>> have a stack or register location). Wheras in the coroutine code
>> you're generating the variable address doesn't get storage.
>> 
>> If [0] is useful for you I can get that landed; it'd be good to
hear
>> whether this resolves the dbg.addr intrinsics not having an affect on
>> the output.
>> 
>> [0]
https://github.com/jmorse/llvm-project/commit/40927e6c2b71ec914d937287a0c2ca6c52c01f6b
>> 
>> --
>> Thanks,
>> Jeremy
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20200226/7a94d202/attachment.html>

Possibly Parallel Threads

Search for more maybe matching threads

llvm dev - Feb 2020 - Why is lldb telling me "variable not available"?

[llvm-dev] Why is lldb telling me "variable not available"?

[llvm-dev] Why is lldb telling me "variable not available"?

[llvm-dev] Why is lldb telling me "variable not available"?

Possibly Parallel Threads