thr3ads.net - llvm dev - [llvm-dev] BoF: Debug info for optimized code. [Nov 2016]

Krzysztof Parzyszek via llvm-dev

2016-Nov-02 15:14 UTC

[llvm-dev] BoF: Debug info for optimized code.

Hi Martin,
Yes, the patch only changes the format of line information.  There will 
be more work needed for fully implementing it across all tools.
Here your concern still stands---more focus on debug information for 
VLIW architectures would be welcome.  I was only pointing out that the 
necessary capacity of the debug information to carry this data does in 
fact exist, and that at least one step for getting it into LLVM has been 
attempted (the patch was reverted shortly after commit).

-Krzysztof
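
For context, the 'op_index' register discussed in this thread is part of the DWARF version 4 line-number state machine. It is paired with a maximum_operations_per_instruction header field so that several operations inside one VLIW instruction can each have their own line-table row. A minimal sketch of the advance rule from the DWARF 4 specification follows. The Python function and the bundle sizes in the usage note are illustrative assumptions and are not taken from the patch.

```python
def advance(address, op_index, operation_advance,
            min_inst_length, max_ops_per_inst):
    """Apply a DWARF 4 style 'operation advance' to (address, op_index).

    The address only moves when the advance crosses an instruction
    boundary; otherwise only op_index changes, so several rows can
    share one address.
    """
    total = op_index + operation_advance
    new_address = address + min_inst_length * (total // max_ops_per_inst)
    new_op_index = total % max_ops_per_inst
    return new_address, new_op_index
```

With a hypothetical 16-byte instruction holding up to 4 operations, advancing by one operation from op_index 0 stays at the same address, while advancing from op_index 3 moves to the next instruction and resets op_index to 0.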


On 11/2/2016 4:03 AM, Martin J. O'Riordan via llvm-dev wrote:
> Thanks Krzysztof, I hadn't noticed this.
>
> The patch refers to the target providing an 'op_index' register, but this
> seems like something that can only be handled by an integrated assembler.
> We use an external assembler, and I am curious whether there are new
> directives that we need to support for this.  At the moment our assembler
> is unable to accept '.loc' directives between each operation in a VLIW
> instruction.  Is this something that we need to implement to get this
> level of VLIW debug support?
>
> Thanks,
>
> 	MartinO
>
> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Krzysztof Parzyszek via llvm-dev
> Sent: 01 November 2016 21:35
> To: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] BoF: Debug info for optimized code.
>
> On 11/1/2016 4:28 PM, Martin J. O'Riordan via llvm-dev wrote:
>> I do not even pretend to know much about Dwarf and the representation
>> of debug information, but it does appear that there is little or no
>> support for the idea that a single "instruction" can correspond to
>> multiple diverse lines in the source file.
>
> There is.  There is even a patch for LLVM:
> https://reviews.llvm.org/D16697
>
> -Krzysztof
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted
> by The Linux Foundation
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

Robinson, Paul via llvm-dev

2016-Nov-10 22:07 UTC

[llvm-dev] BoF: Debug info for optimized code.

At the BoF session, Reid Kleckner wrote a few notes on the whiteboard
and then I got a photo of it before the next session started up.  I've
transcribed those notes here, and expanded on them a bit with my own
thoughts.  If anybody else has notes/thoughts, please share them.

Whiteboard notes
----------------
Variable info metrics
- Induction variable tracking
- Contrast -O0 vs -O2 variables, breakpoint locations
- Track line info for side effects only
  (semantic stepping) "key" instructions


Unpacking that a bit...

Induction variable tracking
---------------------------
Somebody (Hal?) observed that in counted loops (I = 1 to N) the counter
often gets transformed into something else more useful (e.g. an offset 
instead of an index).  DWARF is powerful enough to express how to recover 
the original counter value, if only the induction transformation had a way
to describe what it did (or more precisely, how to recover the original
value after what it did).
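
As a rough illustration of the kind of recovery expression meant here, suppose a counter I running from 1 to N has been replaced by a byte offset that grows by 8 per iteration. A DWARF-style stack expression can then compute I as offset / 8 + 1. The evaluator below is a toy model, not LLVM or debugger code. Its operation names imitate DW_OP_constu, DW_OP_div and DW_OP_plus_uconst, while the "regval" operation, the register name "off", and the stride of 8 are assumptions made for the example.

```python
def evaluate(expr, registers):
    """Evaluate a tiny DWARF-like postfix expression on a value stack."""
    stack = []
    for op, *args in expr:
        if op == "regval":         # hypothetical: push a register's value
            stack.append(registers[args[0]])
        elif op == "constu":       # push an unsigned constant
            stack.append(args[0])
        elif op == "div":          # second-from-top divided by top
            top = stack.pop()
            second = stack.pop()
            stack.append(second // top)
        elif op == "plus_uconst":  # add a constant to the top entry
            stack.append(stack.pop() + args[0])
        else:
            raise ValueError(f"unknown operation: {op}")
    return stack.pop()


# Original counter I, recovered from the strength-reduced byte offset.
RECOVER_I = [
    ("regval", "off"),
    ("constu", 8),
    ("div",),
    ("plus_uconst", 1),
]
```

For instance, when the offset register holds 24 the loop is in its fourth iteration, and evaluate(RECOVER_I, {"off": 24}) yields 4.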


Contrast -O0 vs -O2 variables, breakpoint locations
---------------------------------------------------
This came up during a discussion on debug-info-quality testing/metrics.
One metric for quality of debug info of optimized code is to compare what
is "available" at -O0 to what is "available" at -O2.  This can be
applied to both kinds of debug info affected by optimizations: whether a
variable is available (has a defined location) and whether a breakpoint
is available (the line has a defined "is-a-statement" address).

If you look at the set of instructions where a variable has a valid 
location, how does that set compare to the set of instructions for the 
lexical scope that contains the variable?  If you look at the sets of 
breakpoint locations described by the line table, how does the set for 
-O2 compare to the set for -O0?

It's not hard to imagine tooling that would permit comparisons of this
kind, and some people have had tooling like that in previous jobs.
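
A minimal sketch of the two comparisons described above, assuming the location ranges, scope ranges, and line-table rows have already been extracted from the debug info by some other tool. The data layout (half-open address ranges and rows as dicts with "line" and "is_stmt" keys) and the function names are hypothetical.

```python
def covered_fraction(location_ranges, scope_ranges):
    """Fraction of the scope's addresses where the variable has a location.

    Both arguments are lists of half-open (low, high) address ranges.
    """
    scope = {a for lo, hi in scope_ranges for a in range(lo, hi)}
    located = {a for lo, hi in location_ranges for a in range(lo, hi)}
    return len(located & scope) / len(scope) if scope else 0.0


def breakpoint_lines(line_table):
    """Source lines that have at least one is-a-statement row."""
    return {row["line"] for row in line_table if row["is_stmt"]}


def retained_breakpoints(o0_table, o2_table):
    """Fraction of -O0 breakpoint lines still available at -O2."""
    o0 = breakpoint_lines(o0_table)
    o2 = breakpoint_lines(o2_table)
    return len(o0 & o2) / len(o0) if o0 else 0.0
```

A variable located over 8 of the 16 bytes of its scope scores 0.5 on the first metric. The second metric scores 0.5 when half of the lines that could take a breakpoint at -O0 still can at -O2.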


Track line info for side effects only
(aka semantic stepping or "key" instructions)
---------------------------------------------
This idea is based on two observations:
(1) Optimization tends to shuffle instructions around, so that you end
    up with instructions "from" a given source line being mixed in with
    instructions "from" other source lines.  If we very precisely track
    the source line for every instruction, then single-stepping through
    "the source" in a debugger becomes very back-and-forth and choosing
    a good place to set a breakpoint on "the line" becomes a dicey
    proposition.
(2) If you look at the set of instructions generated for a given line,
    it's easy to conclude that "some are more equal than others."  This
    means for something like a simple assignment, the load is kind of
    important, the ZEXT not so much, and the store is really the thing.
So, picking and choosing which instructions to mark as good stopping
places could well improve the user-experience without significantly
interfering with the user's ability to see what their program is doing.

[Okay, I'm really going beyond what we said in the BoF, but I think it's
a worthwhile point to expand upon.]

Let's unpack an assignment from an 'unsigned short' to an 'unsigned long'
as an example.  This basically turns into a load/ZEXT/store sequence.

If you have an optimization that hoists the load+ZEXT above an 'if' or
loop-top, but leaves the store down inside the 'then' part or loop body,
is it really important to tag the load+ZEXT with the original source
line?  If you want to stop on "the line," doing it just before the store
is really the critical thing.

That is, the store is the "key" or "semantically significant" instruction
here, and the load/ZEXT are not so important.  You can have a smooth,
user-friendly debugging experience if you mark the store as a good
stopping point for that statement, and don't mark the load/ZEXT that way
(even though, pedantically, the load/ZEXT are also "from" the same source
statement).

Now, how far you take this idea and in what circumstances is arguable,
because it very quickly enters the arena of human-factors quality, and
people may differ in their preferences for "precise" versus "smooth"
single-stepping or breakpoint-location experience.  But these things
definitely have an effect on the experience and we have to be willing
to trade off one for the other in some cases.
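
The load/ZEXT/store example can be sketched as a small marking pass. This is only an illustration of the idea under stated assumptions: the instruction records, the opcode names, and the choice of which opcodes count as "key" are all hypothetical, and nothing here reflects an existing LLVM pass.

```python
# Assumption: opcodes treated as having a user-visible side effect.
KEY_OPCODES = {"store", "call", "ret"}


def mark_key_instructions(instructions):
    """Return copies of the instructions with an 'is_stmt' flag added.

    Every instruction keeps its source line; only the key ones are
    flagged as recommended stopping points.
    """
    return [dict(ins, is_stmt=ins["opcode"] in KEY_OPCODES)
            for ins in instructions]


def stopping_addresses(instructions, line):
    """Addresses a debugger would use for a breakpoint on 'line'."""
    return [ins["addr"] for ins in mark_key_instructions(instructions)
            if ins["line"] == line and ins["is_stmt"]]


# Assignment on line 10; its load and ZEXT were hoisted above the
# conditional branch that belongs to line 9.
EXAMPLE = [
    {"addr": 0x00, "opcode": "load", "line": 10},
    {"addr": 0x04, "opcode": "zext", "line": 10},
    {"addr": 0x08, "opcode": "branch", "line": 9},
    {"addr": 0x0C, "opcode": "store", "line": 10},
]
```

Here a breakpoint on line 10 lands only on the store at 0x0C, so stepping does not bounce back to the hoisted load. A real scheme would also need a fallback so that a line with no key instruction, such as line 9 in this example, still gets a stopping point.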

Thanks,
--paulr

Hal Finkel via llvm-dev

2016-Nov-10 22:30 UTC

[llvm-dev] BoF: Debug info for optimized code.

----- Original Message -----
> From: "Paul via llvm-dev Robinson" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Thursday, November 10, 2016 4:07:06 PM
> Subject: Re: [llvm-dev] BoF: Debug info for optimized code.
> 
> At the BoF session, Reid Kleckner wrote a few notes on the whiteboard
> and then I got a photo of it before the next session started up.  I've
> transcribed those notes here, and expanded on them a bit with my own
> thoughts.  If anybody else has notes/thoughts, please share them.
> 
> Whiteboard notes
> ----------------
> Variable info metrics
> - Induction variable tracking
> - Contrast -O0 vs -O2 variables, breakpoint locations
> - Track line info for side effects only
>   (semantic stepping) "key" instructions
> 
> 
> Unpacking that a bit...
> 
> Induction variable tracking
> ---------------------------
> Somebody (Hal?) observed that in counted loops (I = 1 to N) the
> counter

Yes, it was me. It was pointed out (in conversations after the BoF) that
we already have some pass (SROA?) that builds expressions for things, but
that's pretty limited. We'll need utilities to build more-general
expressions (and maybe some kind of SCEV visitor to build them), and also,
for full generality, debug intrinsics that take multiple value operands so
that we can write DWARF expressions that refer to multiple values (which
is currently not possible).

Thanks again,
Hal
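
To make the multiple-value point concrete: if the transformed loop keeps only a running pointer, recovering the counter needs both that pointer and the array base, that is, two runtime values in one expression. The sketch below builds such an expression from a hypothetical affine description (start, step) of the original counter, loosely in the spirit of the SCEV visitor suggested above. AffineCounter, build_recovery_expr, and the "value" operation are invented for the illustration and are not LLVM APIs.

```python
from dataclasses import dataclass


@dataclass
class AffineCounter:
    start: int  # first value of the original counter
    step: int   # counter increment per iteration


def build_recovery_expr(counter, pointer_name, base_name, stride):
    """Postfix expression: start + step * ((pointer - base) / stride)."""
    return [
        ("value", pointer_name),   # first runtime value
        ("value", base_name),      # second runtime value
        ("minus",),
        ("constu", stride),
        ("div",),
        ("constu", counter.step),
        ("mul",),
        ("plus_uconst", counter.start),
    ]


def evaluate(expr, values):
    """Evaluate the expression against a dict of named runtime values."""
    binary = {
        "minus": lambda second, top: second - top,
        "div": lambda second, top: second // top,
        "mul": lambda second, top: second * top,
    }
    stack = []
    for op, *args in expr:
        if op == "value":
            stack.append(values[args[0]])
        elif op == "constu":
            stack.append(args[0])
        elif op == "plus_uconst":
            stack.append(stack.pop() + args[0])
        else:
            top = stack.pop()
            second = stack.pop()
            stack.append(binary[op](second, top))
    return stack.pop()
```

With a counter that starts at 1 and steps by 1, a base of 0x1000, a pointer of 0x1018, and a stride of 8, the expression evaluates to 4. The two "value" entries are what a single-operand debug intrinsic cannot express.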

> often gets transformed into something else more useful (e.g. an offset
> instead of an index).  DWARF is powerful enough to express how to recover
> the original counter value, if only the induction transformation had a
> way to describe what it did (or more precisely, how to recover the
> original value after what it did).
