Djordje via llvm-dev
2020-Jun-18 08:58 UTC
[llvm-dev] [DebugInfo] RFC: Introduce LLVM DI Checker utility
Hi Vedant, Thanks a lot for your comments! >It looks like a lot of the new infrastructure introduced here <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists of logic copied from the debugify implementation. Why is introducing a new pair of passes better than extending the ones we have? The core infrastructure needed to track location loss for real (non-synthetic) source variables is is in place already. Since it is a proposal, I thought it'd easier to understand the idea if I duplicate things. Ideally, we can make an API that could be used from both tools. Initially, I made a few patches locally turning the real debug info into debugify ones, but I realized it breaks the original idea/design of the debugify & that is why I decided to make a separate pass(es). This cannot stay as is with the respect to the implementation, it should be either merged into debugify file(s) or refactored using the API mentioned above. Another reason for implementing it as a different pass was the fact the debugify is meant to be used from 'opt' level only, but if we want to invoke the option from front end level, we need to merge it into the IR library. >Stepping back a bit, I’m also surprised by the decision to move away from synthetic testing when there’s still so much low-hanging fruit to pick using that technique. The example from https://reviews.llvm.org/D81939 illustrates this perfectly: in this case it’s not necessary to invent a new testing technique to uncover the bug, because simply running `./bin/llvm-lit -Dopt="opt -debugify-each" test/Transforms/DeadArgElim` finds the same issue. As I mentioned in the previous mail, I do really think the debugify technique is great & I use it. But, in order to detect that variable "x" was optimized-out starting from pass Y, I only run the di-checker option (that performs analysis only) & find the variable in the final html report. I think that is very user friendly concept. At the end, when we detected what was the spot of loosing the location, we can run debugify on the pass-directory-tests (but there is a concern the tests does not cover all the possible cases; and the case found from the high level could be new to the pass). In addition, the di-checker detects issues for metadata other than locations (currently, the preservation map keeps the disubprograms only, but it should keep other kinds too). >In D81939, you discuss finding the new tool useful when responding to bug reports about optimized-out variables or missing locations. We sorely do need something better than -opt-bisect-limit, but why not start with something simple? -check-debugify already knows how to report when & where a location is dropped, it would be simple to teach it to emit a report when a variable is fully optimized-out. I agree. We can do that and that could be used from both utilities. Best regards, Djordje On 17.6.20. 21:14, Vedant Kumar wrote:> Hey Djordje, > > It looks like a lot of the new infrastructure introduced here > <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists > of logic copied from the debugify implementation. Why is introducing a > new pair of passes better than extending the ones we have? The core > infrastructure needed to track location loss for real (non-synthetic) > source variables is is in place already. > > Stepping back a bit, I’m also surprised by the decision to move away > from synthetic testing when there’s still so much low-hanging fruit to > pick using that technique. The example from > https://reviews.llvm.org/D81939 illustrates this perfectly: in this > case it’s not necessary to invent a new testing technique to uncover > the bug, because simply running `./bin/llvm-lit -Dopt="opt > -debugify-each" test/Transforms/DeadArgElim` finds the same issue. > > In D81939, you discuss finding the new tool useful when responding to > bug reports about optimized-out variables or missing locations. We > sorely do need something better than -opt-bisect-limit, but why not > start with something simple? -check-debugify already knows how to > report when & where a location is dropped, it would be simple to teach > it to emit a report when a variable is fully optimized-out. > > >> On Jun 17, 2020, at 2:10 AM, Djordje <djordje.todorovic at syrmia.com >> <mailto:djordje.todorovic at syrmia.com>> wrote: >> >> I am sharing the proposal [0] which gives a brief introduction for >> the implementation of the LLVM DI Checker utility. On a very high >> level, it is a pair of LLVM (IR) Passes that check the preservation >> of the original debug info in the optimizations. There are options >> controlling the passes, that could be invoked from ``clang`` as well >> as from ``opt`` level. >> >> By testing the utility on the GDB 7.11 project (using it as a >> testbed), it has found a certain number of potential issues regarding >> the DILocations (using it on LLVM project build itself, it has found >> one bug regarding DISubprogram metadata). Please take a look into the >> final report (on the GDB 7.11 testbed) generated from the script that >> collects the data at [1]. By looking at these data, it looks that the >> utility like this could be useful when trying to detect the real >> issues related to debug info production by the compiler. > > Thanks for sharing these results. The data here is older (from the > 2018 debug info BoF) and from a different project (sqlite3), but we > saw some similar patterns: > https://llvm.org/devmtg/2018-10/slides/Prantl-Kumar-debug-info-bof-2018.pdf > > best > vedant-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200618/2d5beac6/attachment.html>
Vedant Kumar via llvm-dev
2020-Jun-19 06:27 UTC
[llvm-dev] [DebugInfo] RFC: Introduce LLVM DI Checker utility
> On Jun 18, 2020, at 1:58 AM, Djordje <djordje.todorovic at syrmia.com> wrote: > > Hi Vedant, > > Thanks a lot for your comments! > > >It looks like a lot of the new infrastructure introduced here <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists of logic copied from the debugify implementation. Why is introducing a new pair of passes better than extending the ones we have? The core infrastructure needed to track location loss for real (non-synthetic) source variables is is in place already. > > Since it is a proposal, I thought it'd easier to understand the idea if I duplicate things. Ideally, we can make an API that could be used from both tools. Initially, I made a few patches locally turning the real debug info into debugify ones, but I realized it breaks the original idea/design of the debugify & that is why I decided to make a separate pass(es). This cannot stay as is with the respect to the implementation, it should be either merged into debugify file(s) or refactored using the API mentioned above. >Oh, this wasn’t clear to me. In the future, please describe what is/isn’t part of the proposal. It’d be especially helpful to include some discussion of the pros & cons of the prototype design and its alternatives.> Another reason for implementing it as a different pass was the fact the debugify is meant to be used from 'opt' level only, but if we want to invoke the option from front end level, we need to merge it into the IR library. >The debugify passes are now linked by llc via the Transforms library as part of the -mir-debugify implementation. A frontend should be able to use them.> >Stepping back a bit, I’m also surprised by the decision to move away from synthetic testing when there’s still so much low-hanging fruit to pick using that technique. The example from https://reviews.llvm.org/D81939 <https://reviews.llvm.org/D81939> illustrates this perfectly: in this case it’s not necessary to invent a new testing technique to uncover the bug, because simply running `./bin/llvm-lit -Dopt="opt -debugify-each" test/Transforms/DeadArgElim` finds the same issue. > As I mentioned in the previous mail, I do really think the debugify technique is great & I use it. But, in order to detect that variable "x" was optimized-out starting from pass Y, I only run the di-checker option (that performs analysis only) & find the variable in the final html report. I think that is very user friendly concept. >About the analysis — why not emit a report in -check-debugify when the # of non-undef debug uses of a variable drops to 0? This doesn’t require the -debugify step. The list of local variables is preserved via DISubprogram::retainedNodes.> At the end, when we detected what was the spot of loosing the location, we can run debugify on the pass-directory-tests (but there is a concern the tests does not cover all the possible cases; and the case found from the high level could be new to the pass). In addition, the di-checker detects issues for metadata other than locations (currently, the preservation map keeps the disubprograms only, but it should keep other kinds too). >Note that it’s possible to to increase code coverage by running a -debugify-each pipeline on -g0 IR produced by a frontend. Is it common for a pass to drop an entire DISubprogram? I would hope this either never happens or is extremely rare. best, vedant> >In D81939, you discuss finding the new tool useful when responding to bug reports about optimized-out variables or missing locations. We sorely do need something better than -opt-bisect-limit, but why not start with something simple? -check-debugify already knows how to report when & where a location is dropped, it would be simple to teach it to emit a report when a variable is fully optimized-out. > > I agree. We can do that and that could be used from both utilities. > > > > Best regards, > > Djordje > > > > On 17.6.20. 21:14, Vedant Kumar wrote: >> Hey Djordje, >> >> It looks like a lot of the new infrastructure introduced here <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists of logic copied from the debugify implementation. Why is introducing a new pair of passes better than extending the ones we have? The core infrastructure needed to track location loss for real (non-synthetic) source variables is is in place already. >> >> Stepping back a bit, I’m also surprised by the decision to move away from synthetic testing when there’s still so much low-hanging fruit to pick using that technique. The example from https://reviews.llvm.org/D81939 <https://reviews.llvm.org/D81939> illustrates this perfectly: in this case it’s not necessary to invent a new testing technique to uncover the bug, because simply running `./bin/llvm-lit -Dopt="opt -debugify-each" test/Transforms/DeadArgElim` finds the same issue. >> >> In D81939, you discuss finding the new tool useful when responding to bug reports about optimized-out variables or missing locations. We sorely do need something better than -opt-bisect-limit, but why not start with something simple? -check-debugify already knows how to report when & where a location is dropped, it would be simple to teach it to emit a report when a variable is fully optimized-out. >> >> >>> On Jun 17, 2020, at 2:10 AM, Djordje <djordje.todorovic at syrmia.com <mailto:djordje.todorovic at syrmia.com>> wrote: >>> >>> I am sharing the proposal [0] which gives a brief introduction for the implementation of the LLVM DI Checker utility. On a very high level, it is a pair of LLVM (IR) Passes that check the preservation of the original debug info in the optimizations. There are options controlling the passes, that could be invoked from ``clang`` as well as from ``opt`` level. >>> >>> By testing the utility on the GDB 7.11 project (using it as a testbed), it has found a certain number of potential issues regarding the DILocations (using it on LLVM project build itself, it has found one bug regarding DISubprogram metadata). Please take a look into the final report (on the GDB 7.11 testbed) generated from the script that collects the data at [1]. By looking at these data, it looks that the utility like this could be useful when trying to detect the real issues related to debug info production by the compiler. >> >> Thanks for sharing these results. The data here is older (from the 2018 debug info BoF) and from a different project (sqlite3), but we saw some similar patterns: https://llvm.org/devmtg/2018-10/slides/Prantl-Kumar-debug-info-bof-2018.pdf <https://llvm.org/devmtg/2018-10/slides/Prantl-Kumar-debug-info-bof-2018.pdf> >> >> best >> vedant-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200618/eec53877/attachment.html>
Djordje via llvm-dev
2020-Jun-19 07:33 UTC
[llvm-dev] [DebugInfo] RFC: Introduce LLVM DI Checker utility
On 19.6.20. 08:27, Vedant Kumar wrote:> > >> On Jun 18, 2020, at 1:58 AM, Djordje <djordje.todorovic at syrmia.com >> <mailto:djordje.todorovic at syrmia.com>> wrote: >> >> Hi Vedant, >> >> Thanks a lot for your comments! >> >> >It looks like a lot of the new infrastructure introduced here >> <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists >> of logic copied from the debugify implementation. Why is introducing >> a new pair of passes better than extending the ones we have? The core >> infrastructure needed to track location loss for real (non-synthetic) >> source variables is is in place already. >> >> Since it is a proposal, I thought it'd easier to understand the idea >> if I duplicate things. Ideally, we can make an API that could be used >> from both tools. Initially, I made a few patches locally turning the >> real debug info into debugify ones, but I realized it breaks the >> original idea/design of the debugify & that is why I decided to make >> a separate pass(es). This cannot stay as is with the respect to the >> implementation, it should be either merged into debugify file(s) or >> refactored using the API mentioned above. >> > Oh, this wasn’t clear to me. In the future, please describe what > is/isn’t part of the proposal. It’d be especially helpful to include > some discussion of the pros & cons of the prototype design and its > alternatives. >Sorry If I wasn't clear enough. I thought if I write [PROPOSAL], all of it is considered as a proposal.>> >> Another reason for implementing it as a different pass was the fact >> the debugify is meant to be used from 'opt' level only, but if we >> want to invoke the option from front end level, we need to merge it >> into the IR library. >> > The debugify passes are now linked by llc via the Transforms library > as part of the -mir-debugify implementation. A frontend should be able > to use them.I'll try that, thanks.> >> >Stepping back a bit, I’m also surprised by the decision to move away >> from synthetic testing when there’s still so much low-hanging fruit >> to pick using that technique. The example from >> https://reviews.llvm.org/D81939 illustrates this perfectly: in this >> case it’s not necessary to invent a new testing technique to uncover >> the bug, because simply running `./bin/llvm-lit -Dopt="opt >> -debugify-each" test/Transforms/DeadArgElim` finds the same issue. >> >> As I mentioned in the previous mail, I do really think the debugify >> technique is great & I use it. But, in order to detect that variable >> "x" was optimized-out starting from pass Y, I only run the di-checker >> option (that performs analysis only) & find the variable in the final >> html report. I think that is very user friendly concept. >> > About the analysis — why not emit a report in -check-debugify when the > # of non-undef debug uses of a variable drops to 0? This doesn’t > require the -debugify step. The list of local variables is preserved > via DISubprogram::retainedNodes.Hmm, good idea. But we'd need to workaround the fact/condition the debugify(& check-debugify) works with/expects the synthetic debug info only. Let me try merging these ideas into the code, by removing the duplicated code (I'll try to use the debugify/check-debugify as much as possible by performing analysis only).> >> At the end, when we detected what was the spot of loosing the >> location, we can run debugify on the pass-directory-tests (but there >> is a concern the tests does not cover all the possible cases; and the >> case found from the high level could be new to the pass). In >> addition, the di-checker detects issues for metadata other than >> locations (currently, the preservation map keeps the disubprograms >> only, but it should keep other kinds too). >> > Note that it’s possible to to increase code coverage by running a > -debugify-each pipeline on -g0 IR produced by a frontend. > > Is it common for a pass to drop an entire DISubprogram? I would hope > this either never happens or is extremely rare.It is extremely rare, but there are passes that create new functions, and it possible to forget to update/attach the subprogram on that (the similar situation we face with locations when someone forgets to set debugloc on an instruction). Best, Djordje> > best, > vedant > >> >In D81939, you discuss finding the new tool useful when responding >> to bug reports about optimized-out variables or missing locations. We >> sorely do need something better than -opt-bisect-limit, but why not >> start with something simple? -check-debugify already knows how to >> report when & where a location is dropped, it would be simple to >> teach it to emit a report when a variable is fully optimized-out. >> >> I agree. We can do that and that could be used from both utilities. >> >> >> Best regards, >> >> Djordje >> >> >> On 17.6.20. 21:14, Vedant Kumar wrote: >>> Hey Djordje, >>> >>> It looks like a lot of the new infrastructure introduced here >>> <https://github.com/djolertrk/llvm-di-checker/commit/9d26ac2557c584f6cf82ac5535fc47f8bd267a27> consists >>> of logic copied from the debugify implementation. Why is introducing >>> a new pair of passes better than extending the ones we have? The >>> core infrastructure needed to track location loss for real >>> (non-synthetic) source variables is is in place already. >>> >>> Stepping back a bit, I’m also surprised by the decision to move away >>> from synthetic testing when there’s still so much low-hanging fruit >>> to pick using that technique. The example from >>> https://reviews.llvm.org/D81939 illustrates this perfectly: in this >>> case it’s not necessary to invent a new testing technique to uncover >>> the bug, because simply running `./bin/llvm-lit -Dopt="opt >>> -debugify-each" test/Transforms/DeadArgElim` finds the same issue. >>> >>> In D81939, you discuss finding the new tool useful when responding >>> to bug reports about optimized-out variables or missing locations. >>> We sorely do need something better than -opt-bisect-limit, but why >>> not start with something simple? -check-debugify already knows how >>> to report when & where a location is dropped, it would be simple to >>> teach it to emit a report when a variable is fully optimized-out. >>> >>> >>>> On Jun 17, 2020, at 2:10 AM, Djordje <djordje.todorovic at syrmia.com >>>> <mailto:djordje.todorovic at syrmia.com>> wrote: >>>> >>>> I am sharing the proposal [0] which gives a brief introduction for >>>> the implementation of the LLVM DI Checker utility. On a very high >>>> level, it is a pair of LLVM (IR) Passes that check the preservation >>>> of the original debug info in the optimizations. There are options >>>> controlling the passes, that could be invoked from ``clang`` as >>>> well as from ``opt`` level. >>>> >>>> By testing the utility on the GDB 7.11 project (using it as a >>>> testbed), it has found a certain number of potential issues >>>> regarding the DILocations (using it on LLVM project build itself, >>>> it has found one bug regarding DISubprogram metadata). Please take >>>> a look into the final report (on the GDB 7.11 testbed) generated >>>> from the script that collects the data at [1]. By looking at these >>>> data, it looks that the utility like this could be useful when >>>> trying to detect the real issues related to debug info production >>>> by the compiler. >>> >>> Thanks for sharing these results. The data here is older (from the >>> 2018 debug info BoF) and from a different project (sqlite3), but we >>> saw some similar patterns: >>> https://llvm.org/devmtg/2018-10/slides/Prantl-Kumar-debug-info-bof-2018.pdf >>> >>> best >>> vedant >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200619/ca5cbd87/attachment.html>