thr3ads.net - llvm dev - [llvm-dev] Proposal for multi location debug info support in LLVM IR [Jan 2016]

Keno Fischer via llvm-dev

2016-Jan-05 18:37 UTC

[llvm-dev] Proposal for multi location debug info support in LLVM IR

On Tue, Jan 5, 2016 at 6:59 PM, Adrian Prantl <aprantl at apple.com>
wrote:
> Thanks for the clarification, Paul!
> Keno, just a few more questions for my understanding:
>
> >     - Indicating that a value changed at source level (e.g. because an
> >       assignment occurred)
>
> This is done by a key call.

Correct

> >     - Indicating that the same value is now available in a new location
>
> Additional, alternative locations with identical contents are added by
> passing in the token from a key call.

Correct

> >     - Indicating that a value is no longer available in some location
>
> This is done by another key call (possibly with an %undef location).

Not quite. Another key call could be used if all locations are now invalid.
However, to just remove a single value, I was proposing

; This is the key call
%first = call token @llvm.dbg.value(token undef, %someloc,
                                  metadata !var, metadata !())

; This adds a location (note: it passes the key call's token, %first)
%second = call token @llvm.dbg.value(token %first, %someotherloc,
                                  metadata !var, metadata !())

; This removes the (%second) location
%third = call token @llvm.dbg.value(token %second, metadata token undef,
                                  metadata !var, metadata !())

Thus, to remove a location you always pass in the token of the call that
added the location. This is also the reason why I'm requiring the second
argument to be `token undef` because no valid location can be of type
token, and I wanted to avoid the situation in which a location gets
replaced by undef everywhere, accidentally turning into a removal of the
location specified by the key call.
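
As a rough illustration of the three operations above, here is a minimal Python sketch of the bookkeeping a consumer of these intrinsics might do. The class `VarLocations` and its method names are hypothetical and not part of LLVM. Rejecting an add that uses a stale key token is also an assumption, since the proposal does not say what should happen in that case.

```python
# Hypothetical model of the proposed token semantics; not an LLVM API.
class VarLocations:
    """Tracks the currently valid locations of one source variable."""

    def __init__(self):
        self.live = {}     # token -> location still holding the value
        self.key = None    # token returned by the most recent key call
        self._next = 0

    def _fresh(self):
        self._next += 1
        return self._next

    def key_call(self, location):
        """Value changed at source level: all earlier locations die."""
        tok = self._fresh()
        self.key = tok
        self.live = {} if location is None else {tok: location}
        return tok

    def add(self, key_token, location):
        """The same value is now also available in `location`."""
        if key_token != self.key:
            # Assumption: adding against an old key call is an error.
            raise ValueError("stale key token")
        tok = self._fresh()
        self.live[tok] = location
        return tok

    def remove(self, token):
        """Drop only the location that was introduced by `token`."""
        self.live.pop(token, None)
        return self._fresh()


def demo():
    var = VarLocations()
    first = var.key_call("%someloc")           # key call
    second = var.add(first, "%someotherloc")   # adds a location
    var.remove(second)                         # removes only that location
    return sorted(var.live.values())
```

Calling `demo()` mirrors the three calls above: after the removal only the key call's location is left.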

> > > >
> > > >     - To add a location with the same value for the same variable, you
> > > >       pass the token of the FIRST llvm.dbg.value, as this
> > > >       llvm.dbg.value's first argument
> > > >       E.g. to add another location for the variable above:
> > > >
> > > >         %second = call token @llvm.dbg.value(token %first, metadata %val2,
> > > >                                             metadata !var, metadata !expr2)
> > >
> > > Does this invalidate the first location, or does this add an additional
> > > location to the set of locations for var at this point? If I want to add a
> > > third location, which token do I pass in? Can you explain a bit more what
> > > information the token allows us to express that is currently not possible?
> > >
> >
> > It adds a second location. If you want to add a third location you pass in
> > the first token again.
> > Thus the first call (key call) indicates a change of values, and all
> > locations that have the same value should use the key call's token.
> >
>
> Ok. Looks like this is going to be somewhat verbose for partial updates of
> SROA’ed aggregates as in the following example:
>
> // struct s { int i, j };
> // void foo(struct s) { s.j = 0; ... }
>
> define void @foo(i32 %i, i32 %j) {
>   %token = call llvm.dbg.value(token %undef, %i, !Struct,
> !DIExpression(DW_OP_bit_piece(0, 32)))
>            call llvm.dbg.value(token %token, %j, !Struct,
> !DIExpression(DW_OP_bit_piece(32, 32)))
>   ...
>
>   ; have to repeat %i here:
>   %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
> !DIExpression(DW_OP_bit_piece(0, 32)))
>           call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
> !DIExpression(DW_OP_bit_piece(32, 32)))
>
> On the upside, having all this information explicit could simplify the
> code in DwarfDebug::buildLocationList().
>
Yeah, this is true. We could potentially extend the semantics by allowing
separate key calls for pieces, i.e.

%token = call llvm.dbg.value(token %undef, %i, !Struct,
                             !DIExpression(DW_OP_bit_piece(0, 32)))
         call llvm.dbg.value(token undef, %j, !Struct,
                             !DIExpression(DW_OP_bit_piece(32, 32)))

; This now only invalidates the .j part
%tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
                            !DIExpression(DW_OP_bit_piece(32, 32)))

In that case we would probably have to require that all DW_OP_bit_pieces in
non-key-call expressions are a subrange of those in the associated key call.
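
To make the subrange requirement concrete, here is a small Python sketch of the check a verifier could perform. The function name `is_subrange` is made up for illustration, and pieces are written as `(offset, size)` in bits, following the argument order used in the `DW_OP_bit_piece` examples in this thread.

```python
# Hypothetical verifier helper; not an existing LLVM function.
def is_subrange(piece, key_piece):
    """Return True if `piece` lies entirely inside `key_piece`.

    Both arguments are (offset_bits, size_bits) tuples.
    """
    off, size = piece
    key_off, key_size = key_piece
    return key_off <= off and off + size <= key_off + key_size
```

With this, an alternative location for bits 32..63 is accepted under a key call for `DW_OP_bit_piece(32, 32)`, while one for bits 0..31 is rejected.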

> Is there any information in the tokens that could not be recovered by a
> static analysis of the debug intrinsics?
> Note that having redundant information available explicitly is not
> necessarily a bad thing.
>
I am not entirely sure what you are proposing. You somehow need to be able
to encode which dbg.values invalidate previous locations and which do not.
Since we're describing front-end variables this will generally depend on
front-end semantics, so I'm not sure what a generic analysis pass can do
here without requiring language-specific analysis.

> The one difference I noticed so far is that alternative locations allow
> earlier locations to outlive locations that are dominated by them:
>   %loc = dbg.value(%undef, var, ...)
>   ...
>   %alt = dbg.value(%loc, var, ...)
>   ...
>   ; alt becomes unavailable
>   ...
>   ; %loc is still available here.
>
> Any other advantages that I missed?
>
> -- adrian

Adrian Prantl via llvm-dev

2016-Jan-06 21:58 UTC

[llvm-dev] Proposal for multi location debug info support in LLVM IR

> On Jan 5, 2016, at 10:37 AM, Keno Fischer <kfischer at college.harvard.edu> wrote:
> 
> On Tue, Jan 5, 2016 at 6:59 PM, Adrian Prantl <aprantl at apple.com> wrote:
> Thanks for the clarification, Paul!
> Keno, just a few more questions for my understanding:
> 
> >     - Indicating that a value changed at source level (e.g. because an
> >       assignment occurred)
> 
> This is done by a key call.
> 
> Correct
>  
> >     - Indicating that the same value is now available in a new location
> 
> Additional, alternative locations with identical contents are added by
> passing in the token from a key call.
> 
> Correct
>  
> >     - Indicating that a value is no longer available in some location
> 
> This is done by another key call (possibly with an %undef location).
> 
> Not quite. Another key call could be used if all locations are now invalid.
> However, to just remove a single value, I was proposing
> 
> ; This is the key call
> %first = call token @llvm.dbg.value(token undef, %someloc,
>                                   metadata !var, metadata !())
> 
> ; This adds a location
> %second = call token @llvm.dbg.value(token %first, %someotherloc,
>                                   metadata !var, metadata !())
> 
> ; This removes the (%second) location
> %third = call token @llvm.dbg.value(token %second, metadata token undef,
>                                   metadata !var, metadata !())
> 
> Thus, to remove a location you always pass in the token of the call that
> added the location. This is also the reason why I'm requiring the second
> argument to be `token undef` because no valid location can be of type token,
> and I wanted to avoid the situation in which a location gets replaced by undef
> everywhere, accidentally turning into a removal of the location specified by
> the key call.

Makes sense. If I understand your comment correctly, the following snippet:

%1 = ...
%token = call llvm.dbg.value(token %undef, %1, !var, !())
%2 = ...
call llvm.dbg.value(token %token, %undef, !var, !())
call llvm.dbg.value(token %undef, %2, !var, !())

is equivalent to

%1 = ...
call llvm.dbg.value(token %undef, %1, !var, !())
%2 = ...
call llvm.dbg.value(token %undef, %2, !var, !())

and both are legal.

> > > >
> > > >     - To add a location with the same value for the same variable, you
> > > >       pass the token of the FIRST llvm.dbg.value, as this
> > > >       llvm.dbg.value's first argument
> > > >       E.g. to add another location for the variable above:
> > > >
> > > >         %second = call token @llvm.dbg.value(token %first, metadata %val2,
> > > >                                             metadata !var, metadata !expr2)
> > >
> > > Does this invalidate the first location, or does this add an additional
> > > location to the set of locations for var at this point? If I want to add a
> > > third location, which token do I pass in? Can you explain a bit more what
> > > information the token allows us to express that is currently not possible?
> > >
> >
> > It adds a second location. If you want to add a third location you pass in
> > the first token again.
> > Thus the first call (key call) indicates a change of values, and all
> > locations that have the same value should use the key call's token.
> >
> 
> Ok. Looks like this is going to be somewhat verbose for partial updates of
> SROA’ed aggregates as in the following example:
> 
> // struct s { int i, j };
> // void foo(struct s) { s.j = 0; ... }
> 
> define void @foo(i32 %i, i32 %j) {
>   %token = call llvm.dbg.value(token %undef, %i, !Struct,
>                                !DIExpression(DW_OP_bit_piece(0, 32)))
>            call llvm.dbg.value(token %token, %j, !Struct,
>                                !DIExpression(DW_OP_bit_piece(32, 32)))
>   ...
> 
>   ; have to repeat %i here:
>   %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
>                               !DIExpression(DW_OP_bit_piece(0, 32)))
>           call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
>                               !DIExpression(DW_OP_bit_piece(32, 32)))
> 
> On the upside, having all this information explicit could simplify the code
> in DwarfDebug::buildLocationList().
> 
> Yeah, this is true. We could potentially extend the semantics by allowing
> separate key calls for pieces, i.e.
> 
> %token = call llvm.dbg.value(token %undef, %i, !Struct,
>                              !DIExpression(DW_OP_bit_piece(0, 32)))
>          call llvm.dbg.value(token undef, %j, !Struct,
>                              !DIExpression(DW_OP_bit_piece(32, 32)))
> 
> ; This now only invalidates the .j part
> %tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
>                             !DIExpression(DW_OP_bit_piece(32, 32)))
> 
> In that case we would probably have to require that all DW_OP_bit_pieces in
> non-key-call expressions are a subrange of those in the associated key call.

This way all non-key-call additional locations are describing alternative
locations for (a subset of) the bits described by the key-call location. Makes
sense, and again would simplify the backend’s work.

> 
> Is there any information in the tokens that could not be recovered by a
> static analysis of the debug intrinsics?
> Note that having redundant information available explicitly is not
> necessarily a bad thing.
> 
> I am not entirely sure what you are proposing. You somehow need to be able
> to encode which dbg.values invalidate previous locations and which do not.
> Since we're describing front-end variables this will generally depend on
> front-end semantics, so I'm not sure what a generic analysis pass can do here
> without requiring language-specific analysis.

Right. Determining whether two locations have equivalent contents is not
generally decidable.

> The one difference I noticed so far is that alternative locations allow
> earlier locations to outlive locations that are dominated by them:
>   %loc = dbg.value(%undef, var, ...)
>   ...
>   %alt = dbg.value(%loc, var, ...)
>   ...
>   ; alt becomes unavailable
>   ...
>   ; %loc is still available here.
> 
> Any other advantages that I missed?
> 
> -- adrian

One thing I’m wondering about is whether we couldn’t design a friendlier
(assembler) syntax for the three different use-cases:
  %tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
  %tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
  %tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())

Could be written as e.g.:

  %tok1 = call llvm.dbg.value.new(%1, !var, !())
  %tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
  %tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())
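
The mapping between the two spellings is mechanical, which a short Python sketch can show. The function `classify` and the string `"undef"` standing in for `token undef` or an undef location are illustrative assumptions. Only the `new`, `add` and `delete` suffixes come from the suggestion above.

```python
UNDEF = "undef"  # stand-in for `token undef` or an undef location

def classify(token_arg, location_arg):
    """Pick the suffix for a single-intrinsic llvm.dbg.value call.

    A call with no incoming token is a key call ("new"), even when its
    location is undef, which marks every location as invalid.
    """
    if token_arg == UNDEF:
        return "new"
    if location_arg == UNDEF:
        return "delete"
    return "add"
```

For the three calls above this yields `new`, `add` and `delete` in that order.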

-- adrian

Keno Fischer via llvm-dev

2016-Jan-06 22:05 UTC

[llvm-dev] Proposal for multi location debug info support in LLVM IR

>
> Makes sense. If I understand your comment correctly, the following snippet:
>
> %1 = ...
> %token = call llvm.dbg.value(token %undef, %1, !var, !())
> %2 = ...
> call llvm.dbg.value(token %token, %undef, !var, !())
> call llvm.dbg.value(token %undef, %2, !var, !())
>
> is equivalent to
>
> %1 = ...
> call llvm.dbg.value(token %undef, %1, !var, !())
> %2 = ...
> call llvm.dbg.value(token %undef, %2, !var, !())
>
> and both are legal.
>
Yes

>> > > >
>> > > >     - To add a location with the same value for the same variable, you
>> > > >       pass the token of the FIRST llvm.dbg.value, as this
>> > > >       llvm.dbg.value's first argument
>> > > >       E.g. to add another location for the variable above:
>> > > >
>> > > >         %second = call token @llvm.dbg.value(token %first, metadata %val2,
>> > > >                                             metadata !var, metadata !expr2)
>> > >
>> > > Does this invalidate the first location, or does this add an additional
>> > > location to the set of locations for var at this point? If I want to
>> > > add a third location, which token do I pass in? Can you explain a bit
>> > > more what information the token allows us to express that is currently
>> > > not possible?
>> > >
>> >
>> > It adds a second location. If you want to add a third location you pass
>> > in the first token again.
>> > Thus the first call (key call) indicates a change of values, and all
>> > locations that have the same value should use the key call's token.
>> >
>>
>> Ok. Looks like this is going to be somewhat verbose for partial updates
>> of SROA’ed aggregates as in the following example:
>>
>> // struct s { int i, j };
>> // void foo(struct s) { s.j = 0; ... }
>>
>> define void @foo(i32 %i, i32 %j) {
>>   %token = call llvm.dbg.value(token %undef, %i, !Struct,
>> !DIExpression(DW_OP_bit_piece(0, 32)))
>>            call llvm.dbg.value(token %token, %j, !Struct,
>> !DIExpression(DW_OP_bit_piece(32, 32)))
>>   ...
>>
>>   ; have to repeat %i here:
>>   %tok2 = call llvm.dbg.value(token %undef, %i, !Struct,
>> !DIExpression(DW_OP_bit_piece(0, 32)))
>>           call llvm.dbg.value(token %tok2, metadata i32 0, !Struct,
>> !DIExpression(DW_OP_bit_piece(32, 32)))
>>
>> On the upside, having all this information explicit could simplify the
>> code in DwarfDebug::buildLocationList().
>>
>
> Yeah, this is true. We could potentially extend the semantics by allowing
> separate key calls for pieces, i.e.
>
> %token = call llvm.dbg.value(token %undef, %i, !Struct,
> !DIExpression(DW_OP_bit_piece(0, 32)))
>            call llvm.dbg.value(token undef, %j, !Struct,
> !DIExpression(DW_OP_bit_piece(32, 32)))
>
> ; This now only invalidates the .j part
> %tok2 = call llvm.dbg.value(token %undef, %j, !Struct,
> !DIExpression(DW_OP_bit_piece(32, 32)))
>
> In that case we would probably have to require that all DW_OP_bit_pieces
> in non-key-call expressions are a subrange of those in the associated key
> call.
>
>
> This way all non-key-call additional locations are describing alternative
> locations for (a subset of) the bits described by the key-call location. Makes
> sense, and again would simplify the backend’s work.
>
Yes, I think that's a reasonable change to the semantics, so let's make it so.

> One thing I’m wondering about is whether we couldn’t design a friendlier
> (assembler) syntax for the three different use-cases:
>   %tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
>   %tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
>   %tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())
>
> Could be written as e.g.:
>
>   %tok1 = call llvm.dbg.value.new(%1, !var, !())
>   %tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
>   %tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())
>
Yeah, I would be ok with that (and think it's a good idea).

> -- adrian
