Krzysztof Parzyszek via llvm-dev
2016-Nov-02 15:14 UTC
[llvm-dev] BoF: Debug info for optimized code.
Hi Martin,

Yes, the patch only changes the format of line information. There will be more work needed for fully implementing it across all tools. Here your concern still stands: more focus on debug information for VLIW architectures would be welcome. I was only pointing out that the necessary capacity of the debug information to carry this data does in fact exist, and that at least one step for getting it into LLVM has been attempted (the patch was reverted shortly after commit).

-Krzysztof

On 11/2/2016 4:03 AM, Martin J. O'Riordan via llvm-dev wrote:
> Thanks Krzysztof, I hadn't noticed this.
>
> The patch refers to the target providing an 'op_index' register, but this seems like something that can only be handled by an integrated assembler. We use an external assembler and I am curious if there are new directives that we need to support for this? At the moment our assembler is unable to accept '.loc' directives between each operation in a VLIW instruction, is this something that we need to implement to get this level of VLIW debug support?
>
> Thanks,
>
> MartinO
>
> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Krzysztof Parzyszek via llvm-dev
> Sent: 01 November 2016 21:35
> To: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] BoF: Debug info for optimized code.
>
> On 11/1/2016 4:28 PM, Martin J. O'Riordan via llvm-dev wrote:
>> I do not even pretend to know much about Dwarf and the representation of debug information, but it does appear that there is little or no support for the idea that a single "instruction" can correspond to multiple diverse lines in the source file.
>
> There is. There is even a patch for LLVM:
> https://reviews.llvm.org/D16697
>
> -Krzysztof
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
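As background on the capacity mentioned above: since DWARF version 4, the line-number program header carries a maximum_operations_per_instruction field and the state machine has an op_index register, so a row in the line table can name a single operation inside a VLIW instruction. The Python sketch below follows the special-opcode arithmetic given in the DWARF specification. The class name, the function name, and the default header values are illustrative assumptions; nothing here is taken from the patch under review or from any particular target.

```python
from dataclasses import dataclass


@dataclass
class LineState:
    """The three line-table registers this sketch cares about."""
    address: int = 0
    op_index: int = 0   # which operation inside the VLIW instruction
    line: int = 1


def apply_special_opcode(state, opcode, *, opcode_base=13, line_base=-5,
                         line_range=14, min_inst_length=1,
                         max_ops_per_inst=1):
    """Return the state after one DWARF special opcode.

    The keyword defaults are example header values (an assumption);
    a real consumer reads them from the line-number program header.
    """
    adjusted = opcode - opcode_base
    operation_advance = adjusted // line_range
    total_ops = state.op_index + operation_advance
    return LineState(
        # The address only moves when op_index wraps past the last slot.
        address=state.address
        + min_inst_length * (total_ops // max_ops_per_inst),
        op_index=total_ops % max_ops_per_inst,
        line=state.line + line_base + (adjusted % line_range),
    )
```

With a hypothetical 16-byte instruction holding four operations, `apply_special_opcode(LineState(0x1000, 3, 10), 34, min_inst_length=16, max_ops_per_inst=4)` wraps op_index from 3 to 0, moves the address to 0x1010, and advances the line to 12. Starting from op_index 0 instead, the same opcode leaves the address at 0x1000 and only bumps op_index to 1.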
Robinson, Paul via llvm-dev
2016-Nov-10 22:07 UTC
[llvm-dev] BoF: Debug info for optimized code.
At the BoF session, Reid Kleckner wrote a few notes on the whiteboard and then I got a photo of it before the next session started up. I've transcribed those notes here, and expanded on them a bit with my own thoughts. If anybody else has notes/thoughts, please share them.

Whiteboard notes
----------------
Variable info metrics
- Induction variable tracking
- Contrast -O0 vs -O2 variables, breakpoint locations
- Track line info for side effects only
  (semantic stepping) "key" instructions


Unpacking that a bit...

Induction variable tracking
---------------------------
Somebody (Hal?) observed that in counted loops (I = 1 to N) the counter often gets transformed into something else more useful (e.g. an offset instead of an index). DWARF is powerful enough to express how to recover the original counter value, if only the induction transformation had a way to describe what it did (or more precisely, how to recover the original value after what it did).


Contrast -O0 vs -O2 variables, breakpoint locations
---------------------------------------------------
This came up during a discussion on debug-info-quality testing/metrics. One metric for quality of debug info of optimized code is to compare what is "available" at -O0 to what is "available" at -O2. This can be applied to both kinds of debug info affected by optimizations: whether a variable is available (has a defined location) and whether a breakpoint is available (the line has a defined "is-a-statement" address).

If you look at the set of instructions where a variable has a valid location, how does that set compare to the set of instructions for the lexical scope that contains the variable? If you look at the sets of breakpoint locations described by the line table, how does the set for -O2 compare to the set for -O0?

It's not hard to imagine tooling that would permit comparisons of this kind, and some people have had tooling like that in previous jobs.

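A minimal sketch of the two comparisons described above, in Python. It assumes that the instruction addresses of a lexical scope, the variable's location ranges, and the sets of source lines with an is-a-statement address have already been extracted from the debug info by some other tool. The function names and input shapes are hypothetical and do not describe any existing tooling.

```python
def variable_coverage(scope_addrs, location_ranges):
    """Fraction of a scope's instructions at which a variable has a location.

    scope_addrs: instruction addresses belonging to the enclosing scope.
    location_ranges: half-open (low, high) address ranges where the
    variable has a defined location.
    """
    scope = set(scope_addrs)
    if not scope:
        return 0.0
    covered = {
        addr for addr in scope
        if any(low <= addr < high for low, high in location_ranges)
    }
    return len(covered) / len(scope)


def breakpoint_line_retention(lines_o0, lines_o2):
    """Fraction of lines breakable at -O0 that are still breakable at -O2.

    Each argument is the set of source lines that have at least one
    is-a-statement address in the corresponding line table.
    """
    baseline = set(lines_o0)
    if not baseline:
        return 0.0
    return len(baseline & set(lines_o2)) / len(baseline)
```

For example, a scope with instructions at 0x10, 0x14, 0x18 and 0x1c and a single location range (0x14, 0x1c) gives a coverage of 0.5. Running the same measurement over the -O0 and -O2 builds of one source file yields the kind of side-by-side numbers the metric calls for.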

Track line info for side effects only
(aka semantic stepping or "key" instructions)
---------------------------------------------
This idea is based on two observations:
(1) Optimization tends to shuffle instructions around, so that you end up with instructions "from" a given source line being mixed in with instructions "from" other source lines. If we very precisely track the source line for every instruction, then single-stepping through "the source" in a debugger becomes very back-and-forth and choosing a good place to set a breakpoint on "the line" becomes a dicey proposition.
(2) If you look at the set of instructions generated for a given line, it's easy to conclude that "some are more equal than others." This means for something like a simple assignment, the load is kind of important, the ZEXT not so much, and the store is really the thing.
So, picking and choosing which instructions to mark as good stopping places could well improve the user-experience without significantly interfering with the user's ability to see what their program is doing.

[Okay, I'm really going beyond what we said in the BoF, but I think it's a worthwhile point to expand upon.]

Let's unpack an assignment from an 'unsigned short' to an 'unsigned long' as an example. This basically turns into a load/ZEXT/store sequence.

If you have an optimization that hoists the load+ZEXT above an 'if' or loop-top, but leaves the store down inside the 'then' part or loop body, is it really important to tag the load+ZEXT with the original source line? If you want to stop on "the line," doing it just before the store is really the critical thing.

That is, the store is the "key" or "semantically significant" instruction here, and the load/ZEXT are not so important. You can have a smooth, user-friendly debugging experience if you mark the store as a good stopping point for that statement, and don't mark the load/ZEXT that way (even though, pedantically, the load/ZEXT are also "from" the same source statement).

Now, how far you take this idea and in what circumstances is arguable because it very quickly is in the arena of human-factors quality, and people may differ in their preferences for "precise" versus "smooth" single-stepping or breakpoint-location experience. But these things definitely have an effect on the experience and we have to be willing to trade off one for the other in some cases.

Thanks,
--paulr
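The hoisted load/ZEXT/store case above can be sketched in a few lines of Python. This is a toy model only: the instruction list is hand-written, the set of "key" opcodes is an assumption chosen for the example, and the stepping rule (stop at each is-a-statement instruction whose line differs from the previous stop) is a simplification of what a real debugger does.

```python
from typing import NamedTuple


class Inst(NamedTuple):
    opcode: str
    line: int   # source line the instruction is attributed to


# Assumption for this sketch: which opcodes count as "key" instructions.
KEY_OPCODES = {"store", "call", "br", "ret"}


def mark_key_instructions(insts):
    """Pair each instruction with an is_stmt flag set only on key opcodes."""
    return [(inst, inst.opcode in KEY_OPCODES) for inst in insts]


def stepping_lines(marked):
    """Source lines visited, in order, when stepping by is_stmt rows."""
    stops = []
    for inst, is_stmt in marked:
        if is_stmt and (not stops or stops[-1] != inst.line):
            stops.append(inst.line)
    return stops


# Line 4 is the 'if'; line 5 is the assignment whose load and zext were
# hoisted above the branch; line 7 is the return.
hoisted = [
    Inst("load", 5), Inst("zext", 5),
    Inst("cmp", 4), Inst("br", 4),
    Inst("store", 5),
    Inst("ret", 7),
]
```

Marking every instruction gives `stepping_lines([(i, True) for i in hoisted])`, which visits lines 5, 4, 5, 7: the back-and-forth described in observation (1). With `stepping_lines(mark_key_instructions(hoisted))` the sequence is 4, 5, 7, and the stop on line 5 lands just before the store.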
Hal Finkel via llvm-dev
2016-Nov-10 22:30 UTC
[llvm-dev] BoF: Debug info for optimized code.
----- Original Message -----
> From: "Paul via llvm-dev Robinson" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Thursday, November 10, 2016 4:07:06 PM
> Subject: Re: [llvm-dev] BoF: Debug info for optimized code.
>
> Induction variable tracking
> ---------------------------
> Somebody (Hal?) observed that in counted loops (I = 1 to N) the
> counter

Yes, it was me. It was pointed out (in conversations after the BoF) that we already have some pass (SROA?) that builds expressions for things; but that's pretty limited. We'll need utilities to build more-general expressions (and maybe some kind of SCEV visitor to build them), and also for full generality, debug intrinsics that take multiple value operands so that we can write DWARF expressions that refer to multiple values (which is currently not possible).

Thanks again,
Hal

> often gets transformed into something else more useful (e.g. an
> offset instead of an index). DWARF is powerful enough to express how
> to recover the original counter value, if only the induction
> transformation had a way to describe what it did (or more precisely,
> how to recover the original value after what it did).
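To make the induction-variable point concrete, here is a small Python sketch of a stack evaluator for a DWARF-style expression that recovers a loop counter from a strength-reduced pointer. It assumes the loop `for (i = 0; i < n; i++) a[i] = ...` was rewritten to walk a pointer `p` in 4-byte steps, so that `i == (p - base) / 4`. DW_OP_minus, DW_OP_div, DW_OP_constu and DW_OP_stack_value are real DWARF operations; the `value` operation is a hypothetical stand-in for however each live value would be located, and it is exactly the part that needs an expression referring to more than one value.

```python
def eval_expr(ops, values):
    """Evaluate a tiny DWARF-like stack expression.

    ops: sequence of tuples, operation name first, operands after.
    values: mapping from a value name to its current contents
            (hypothetical; stands in for registers or stack slots).
    """
    stack = []
    for op, *args in ops:
        if op == "value":                 # hypothetical: push a live value
            stack.append(values[args[0]])
        elif op == "DW_OP_constu":        # push an unsigned constant
            stack.append(args[0])
        elif op == "DW_OP_minus":         # second-from-top minus top
            top, second = stack.pop(), stack.pop()
            stack.append(second - top)
        elif op == "DW_OP_div":           # second-from-top divided by top
            top, second = stack.pop(), stack.pop()
            # Operands are non-negative in this sketch, so floor
            # division matches DWARF's signed division.
            stack.append(second // top)
        elif op == "DW_OP_stack_value":   # result is a value, not an address
            pass
        else:
            raise ValueError(f"unsupported operation: {op}")
    return stack.pop()


# i = (p - base) / 4, written as a postfix expression over two values.
RECOVER_I = [
    ("value", "p"),
    ("value", "base"),
    ("DW_OP_minus",),
    ("DW_OP_constu", 4),
    ("DW_OP_div",),
    ("DW_OP_stack_value",),
]
```

With `p` at 0x1010 and `base` at 0x1000, `eval_expr(RECOVER_I, {"p": 0x1010, "base": 0x1000})` yields 4, the original counter. A utility that walks a SCEV-like description of the transformation could, in principle, emit the inverse expression in this postfix form.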