thr3ads.net - llvm dev - [llvm-dev] [DebugInfo] RFC: Introduce LLVM DI Checker utility [Jun 2020]

Djordje via llvm-dev

2020-Jun-17 09:10 UTC

[llvm-dev] [DebugInfo] RFC: Introduce LLVM DI Checker utility

Hi,

I am sharing a proposal [0] that briefly introduces the implementation
of the LLVM DI Checker utility. At a very high level, it is a pair of
LLVM (IR) passes that check whether the original debug info is preserved
across optimizations. The passes are controlled by options that can be
invoked from ``clang`` as well as from ``opt``.
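
To make the collect-then-check idea concrete, here is a minimal Python
sketch. It is only an illustration under simplifying assumptions, not
the di-checker code: a function is modelled as a plain dictionary from
instruction name to source location, and the names `collect_locations`
and `check_locations` are hypothetical.

```python
from typing import Dict, List, Optional

# Toy model (assumption): instruction name -> "file:line", or None if missing.
Function = Dict[str, Optional[str]]


def collect_locations(func: Function) -> Dict[str, str]:
    """'Before' step: remember which instructions carry a debug location."""
    return {name: loc for name, loc in func.items() if loc is not None}


def check_locations(before: Dict[str, str], func_after: Function) -> List[str]:
    """'After' step: report instructions that survived but lost their location."""
    failures = []
    for name, loc in before.items():
        # Deleted instructions are skipped; only surviving ones are checked.
        if name in func_after and func_after[name] is None:
            failures.append(f"{name}: lost location {loc}")
    return failures


before_pass = {"add": "a.c:3", "mul": "a.c:4", "ret": "a.c:5"}
# After a hypothetical pass: "ret" was deleted, "mul" lost its location.
after_pass = {"add": "a.c:3", "mul": None}

snapshot = collect_locations(before_pass)
report = check_locations(snapshot, after_pass)
print(report)
```

Only "mul" is reported here, because "ret" no longer exists after the
pass and "add" kept its location.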

Testing the utility on the GDB 7.11 project (as a testbed) found a
number of potential issues with DILocations; using it on a build of the
LLVM project itself found one bug involving DISubprogram metadata.
Please take a look at the final report for the GDB 7.11 testbed,
generated by the script that collects the data, at [1]. Judging by these
data, a utility like this could be useful for detecting real issues in
the debug info produced by the compiler.

Any thoughts on this? Thanks in advance!

[0] https://github.com/djolertrk/llvm-di-checker
[1] https://djolertrk.github.io/di-checker-html-report-example/

Best regards,
Djordje

Adrian Prantl via llvm-dev

2020-Jun-17 17:03 UTC

That's a neat idea!

How would a tool like this distinguish between situations where debug locations
are expected to be dropped or merged, such as the ones outlined in
https://reviews.llvm.org/D81198? Is it generating false positives?

You mention that "An alternative to this is the debugify utility, but the
difference is that the LLVM DI Checker deals with real debug info, rather than
with the synthetic ones". How is that an advantage? Are you seeing too many
false positives with the debugify-generated debug locations?

-- adrian

Vedant Kumar via llvm-dev

2020-Jun-17 19:14 UTC

Hey Djordje,

It looks like a lot of the new infrastructure introduced here
<https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27>
consists of logic copied from the debugify implementation. Why is introducing a
new pair of passes better than extending the ones we have? The core
infrastructure needed to track location loss for real (non-synthetic) source
variables is in place already.

Stepping back a bit, I’m also surprised by the decision to move away from
synthetic testing when there’s still so much low-hanging fruit to pick using
that technique. The example from https://reviews.llvm.org/D81939
illustrates this perfectly: in this case
it’s not necessary to invent a new testing technique to uncover the bug, because
simply running `./bin/llvm-lit -Dopt="opt -debugify-each"
test/Transforms/DeadArgElim` finds the same issue.

In D81939, you discuss finding the new tool useful when responding to bug
reports about optimized-out variables or missing locations. We sorely do need
something better than -opt-bisect-limit, but why not start with something
simple? -check-debugify already knows how to report when & where a location
is dropped, it would be simple to teach it to emit a report when a variable is
fully optimized-out.

> On Jun 17, 2020, at 2:10 AM, Djordje <djordje.todorovic at syrmia.com> wrote:
>
> I am sharing the proposal [0] which gives a brief introduction for the
> implementation of the LLVM DI Checker utility. On a very high level, it is a
> pair of LLVM (IR) Passes that check the preservation of the original debug info
> in the optimizations. There are options controlling the passes, that could be
> invoked from ``clang`` as well as from ``opt`` level.
>
> By testing the utility on the GDB 7.11 project (using it as a testbed), it
> has found a certain number of potential issues regarding the DILocations (using
> it on LLVM project build itself, it has found one bug regarding DISubprogram
> metadata). Please take a look into the final report (on the GDB 7.11 testbed)
> generated from the script that collects the data at [1]. By looking at these
> data, it looks that the utility like this could be useful when trying to detect
> the real issues related to debug info production by the compiler.

Thanks for sharing these results. The data here is older (from the 2018 debug
info BoF) and from a different project (sqlite3), but we saw some similar
patterns:
https://llvm.org/devmtg/2018-10/slides/Prantl-Kumar-debug-info-bof-2018.pdf

best
vedant

Djordje via llvm-dev

2020-Jun-18 08:15 UTC

Hi Adrian,

Thanks for the comments!

> How would a tool like this distinguish between situations where debug
> locations are expected to be dropped or merged, such as the ones
> outlined in https://reviews.llvm.org/D81198? Is it generating false
> positives?

Since this is still a proposal, it does not cover these cases yet, but
it shouldn't generate false positives for them. My impression is that we
can check whether a dropped or merged location meets the requirements
outlined in D81198 (e.g. check whether the instruction is still in the
same basic block when the drop occurs) and mark it as a "known
dropping".
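
A rough Python sketch of how such a "known dropping" classification
could look is given below. Everything in it is an assumption made for
illustration: the `InstrState` record, the `classify_drop` helper, and
in particular the rule itself (a drop that coincides with a move to a
different basic block is treated as expected) are placeholders, not a
statement of what D81198 requires or of how the di-checker behaves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstrState:
    """Hypothetical snapshot of one instruction before or after a pass."""
    block: str            # name of the enclosing basic block
    loc: Optional[str]    # "file:line", or None if there is no location


def classify_drop(before: InstrState, after: InstrState) -> str:
    """Classify what happened to an instruction's debug location."""
    if before.loc is None:
        return "no-original-location"
    if after.loc is not None:
        return "preserved"
    # Placeholder rule (assumption): a drop that happens together with a
    # move to another basic block is considered an expected drop.
    if before.block != after.block:
        return "known-dropping"
    return "suspicious-dropping"


moved = classify_drop(InstrState("entry", "a.c:3"), InstrState("loop", None))
stayed = classify_drop(InstrState("entry", "a.c:3"), InstrState("entry", None))
print(moved, stayed)
```

Only the "suspicious-dropping" cases would then need to be shown to the
user; the real criteria would have to be derived from D81198 itself.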

> You mention that "An alternative to this is the debugify utility, but
> the difference is that the LLVM DI Checker deals with real debug info,
> rather than with the synthetic ones". How is that an advantage? Are you
> seeing too many false positives with the debugify-generated debug
> locations?

I was wrong to say "alternative". The two are more likely to be used in
combination. There are no false positives in the debugify report (at
least I haven't seen any; the same core logic is used for the
di-checker), but I see a few differences.

First, since debugify deals with synthetic debug info, it is potentially
limited to the set of metadata kinds that can be generated synthetically
(but I might be mistaken about that).

Second, debugify is part of the Transformation lib, whereas the
di-checker performs analysis only. I am not sure what the overhead is if
we run debugify on every single CU of a large project; my impression was
that the analysis-only approach is cheaper.

Third, the di-checker reports failures for real entities such as "a",
"b", etc., instead of for variables called "1", "2", etc. These are the
entities that users report, as in "My var 'a' is optimized out..." or
"I cannot attach breakpoint to function 'fn1()'".

I don't want to give the impression that we are choosing between these
two, since I really think debugify is a great tool and the two
can/should coexist. I use the di-checker to detect failures from clang's
level, and then I run debugify on the test directory of the relevant
pass. As I just mentioned, the di-checker option can be invoked from
clang's level, since it has been linked into the IR library. In
addition, the di-checker should be extended to support all kinds of
debug info metadata, such as DILexicalScopes, DIGlobalVariables,
dbg_labels, and so on.

Best,

Djordje


Djordje via llvm-dev

2020-Jun-18 08:58 UTC

Hi Vedant,

Thanks a lot for your comments!

> It looks like a lot of the new infrastructure introduced here
> <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27>
> consists of logic copied from the debugify implementation. Why is
> introducing a new pair of passes better than extending the ones we
> have? The core infrastructure needed to track location loss for real
> (non-synthetic) source variables is in place already.

Since it is a proposal, I thought it would be easier to understand the
idea if I duplicated things. Ideally, we can make an API that could be
used from both tools. Initially, I made a few patches locally turning
the real debug info into debugify ones, but I realized that this breaks
the original idea/design of debugify, which is why I decided to make
separate passes. The implementation cannot stay as is: it should be
either merged into the debugify file(s) or refactored using the API
mentioned above. Another reason for implementing it as a different pass
is that debugify is meant to be used from the 'opt' level only, but if
we want to invoke the option from the front end, we need to merge it
into the IR library.


> Stepping back a bit, I’m also surprised by the decision to move away
> from synthetic testing when there’s still so much low-hanging fruit to
> pick using that technique. The example from
> https://reviews.llvm.org/D81939 illustrates this perfectly: in this case
> it’s not necessary to invent a new testing technique to uncover the bug,
> because simply running `./bin/llvm-lit -Dopt="opt -debugify-each"
> test/Transforms/DeadArgElim` finds the same issue.

As I mentioned in the previous mail, I really do think the debugify
technique is great, and I use it. But in order to detect that variable
"x" was optimized out starting from pass Y, I only need to run the
di-checker option (which performs analysis only) and find the variable
in the final HTML report. I think that is a very user-friendly concept.
In the end, once we have detected where the location was lost, we can
run debugify on the tests in that pass's directory (though there is a
concern that the tests do not cover all possible cases, and the case
found from the high level could be new to the pass). In addition, the
di-checker detects issues for metadata other than locations (currently,
the preservation map keeps DISubprograms only, but it should keep other
kinds too).
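
The "variable x was optimized out starting from pass Y" lookup can be
sketched in a few lines of Python. This is a toy model under stated
assumptions, not the di-checker's actual data structures: the
per-pass snapshots, the pass names, and the `first_pass_losing` helper
are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Assumption: after each pass we record the set of source variables
# that still have a usable location.
Snapshots = List[Tuple[str, Set[str]]]


def first_pass_losing(var: str, snapshots: Snapshots) -> Optional[str]:
    """Return the first pass after which `var` has no location, or None."""
    for pass_name, live_vars in snapshots:
        if var not in live_vars:
            return pass_name
    return None


snapshots = [
    ("PassA", {"a", "b", "x"}),
    ("PassB", {"a", "x"}),
    ("PassC", {"a"}),
]

for var in ("a", "b", "x"):
    print(var, first_pass_losing(var, snapshots))
```

In this toy run "b" is attributed to PassB and "x" to PassC, while "a"
survives every pass and yields None.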


> In D81939, you discuss finding the new tool useful when responding to
> bug reports about optimized-out variables or missing locations. We
> sorely do need something better than -opt-bisect-limit, but why not
> start with something simple? -check-debugify already knows how to report
> when & where a location is dropped, it would be simple to teach it to
> emit a report when a variable is fully optimized-out.

I agree. We can do that and that could be used from both utilities.


Best regards,

Djordje


