Vivek Sarkar via llvm-dev
2016-Jan-06 22:02 UTC
[llvm-dev] Proposal for multi location debug info support in LLVM IR
I will be out of the office on January 7th and will return on January 19th. I will not have access to email during this time. Please contact Karen Lavelle at klavelle at rice.edu or 713-348-2062 if you have any questions or concerns.

Best regards,
Annepha

On Jan 6, 2016, at 3:58 PM, Adrian Prantl via llvm-dev <llvm-dev at lists.llvm.org> wrote:

>
> On Jan 5, 2016, at 10:37 AM, Keno Fischer <kfischer at college.harvard.edu> wrote:
> On Tue, Jan 5, 2016 at 6:59 PM, Adrian Prantl <aprantl at apple.com> wrote:
> Thanks for the clarification, Paul!
> Keno, just a few more questions for my understanding:
>
> > - Indicating that a value changed at source level (e.g. because an
> >   assignment occurred)
>
> This is done by a key call.
>
> Correct
>
> > - Indicating that the same value is now available in a new location
>
> Additional, alternative locations with identical contents are added by passing in the token from a key call.
>
> Correct
>
> > - Indicating that a value is no longer available in some location
>
> This is done by another key call (possibly with an %undef location).
>
> Not quite. Another key call could be used if all locations are now invalid. However, to just remove a single value, I was proposing
>
> ; This is the key call
> %first = call token @llvm.dbg.value(token undef, %someloc,
>                                     metadata !var, metadata !())
>
> ; This adds a location
> %second = call token @llvm.dbg.value(token %first, %someotherloc,
>                                      metadata !var, metadata !())
>
> ; This removes the (%second) location
> %third = call token @llvm.dbg.value(token %second, metadata token undef,
>                                     metadata !var, metadata !())
>
> Thus, to remove a location you always pass in the token of the call that added the location. This is also the reason why I'm requiring the second argument to be `token undef` because no valid location can be of type token, and I wanted to avoid the situation in which a location gets replaced by undef everywhere, accidentally turning into a removal of the location specified by the key call.
>
> Makes sense. If I understand your comment correctly, the following snippet:
>
> %1 = ...
> %token = call llvm.dbg.value(token %undef, %1, !var, !())
> %2 = ...
> call llvm.dbg.value(token %token, %undef, !var, !())
> call llvm.dbg.value(token %undef, %2, !var, !())
>
> is equivalent to
>
> %1 = ...
> call llvm.dbg.value(token %undef, %1, !var, !())
> %2 = ...
> call llvm.dbg.value(token %undef, %2, !var, !())
>
> and both are legal.
>
> > > > - To add a location with the same value for the same variable, you pass the
> > > >   token of the FIRST llvm.dbg.value, as this llvm.dbg.value's first argument
> > > >   E.g. to add another location for the variable above:
> > > >
> > > >   %second = call token @llvm.dbg.value(token %first, metadata %val2,
> > > >                                        metadata !var, metadata !expr2)
> > >
> > > Does this invalidate the first location, or does this add an additional location
> > > to the set of locations for var at this point? If I want to add a third location,
> > > which token do I pass in? Can you explain a bit more what information the token
> > > allows us to express that is currently not possible?
> >
> > It adds a second location. If you want to add a third location you pass in
> > the first token again.
> > Thus the first call (key call) indicates a change of values, and all
> > locations that have the same value should use the key call's token.
>
> Ok. Looks like this is going to be somewhat verbose for partial updates of SROA’ed aggregates as in the following example:
>
> // struct s { int i, j };
> // void foo(struct s) { s.j = 0; ... }
>
> define void @foo(i32 %i, i32 %j) {
> %token = call llvm.dbg.value(token %undef, %i, !Struct, !DIExpression(DW_OP_bit_piece(0, 32)))
> call llvm.dbg.value(token %token, %j, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
> ...
>
> ; have to repeat %i here:
> %tok2 = call llvm.dbg.value(token %undef, %i, !Struct, !DIExpression(DW_OP_bit_piece(0, 32)))
> call llvm.dbg.value(token %tok2, metadata i32 0, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
>
> On the upside, having all this information explicit could simplify the code in DwarfDebug::buildLocationList().
>
> Yeah, this is true. We could potentially extend the semantics by allowing separate key calls for pieces, i.e.
>
> %token = call llvm.dbg.value(token %undef, %i, !Struct, !DIExpression(DW_OP_bit_piece(0, 32)))
> call llvm.dbg.value(token undef, %j, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
>
> ; This now only invalidates the .j part
> %tok2 = call llvm.dbg.value(token %undef, %j, !Struct, !DIExpression(DW_OP_bit_piece(32, 32)))
>
> In that case we would probably have to require that all DW_OP_bit_pieces in non-key-call expressions are a subrange of those in the associated key call.
>
> This way all non-key-call additional locations are describing alternative locations for (a subset of) the bits described by the key-call location. Makes sense, and again would simplify the backend’s work.
>
>
> Is there any information in the tokens that could not be recovered by a static analysis of the debug intrinsics?
> Note that having redundant information available explicitly is not necessarily a bad thing.
>
> I am not entirely sure what you are proposing. You somehow need to be able to encode which dbg.values invalidate previous locations and which do not. Since we're describing front-end variables this will generally depend on front-end semantics, so I'm not sure what a generic analysis pass can do here without requiring language-specific analysis.
>
> Right. Determining whether two locations have equivalent contents is not generally decidable.
>
> The one difference I noticed so far is that alternative locations allow earlier locations to outlive locations that are dominated by them:
> %loc = dbg.value(%undef, var, ...)
> ...
> %alt = dbg.value(%loc, var, ...)
> ...
> ; alt becomes unavailable
> ...
> ; %loc is still available here.
>
> Any other advantages that I missed?
>
> -- adrian
>
>
> One thing I’m wondering about is whether we couldn’t design a friendlier (assembler) syntax for the three different use-cases:
> %tok1 = call llvm.dbg.value(token %undef, %1, !var, !())
> %tok2 = call llvm.dbg.value(token %tok1, %2, !var, !())
> %tok3 = call llvm.dbg.value(token %tok1, %undef, !var, !())
>
> Could be written as e.g.:
>
> %tok1 = call llvm.dbg.value.new(%1, !var, !())
> %tok2 = call llvm.dbg.value.add(token %tok1, %2, !var, !())
> %tok3 = call llvm.dbg.value.delete(token %tok1, !var, !())
>
> -- adrian
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
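Editorial sketch: the three token use cases discussed in this message (key call, add a location, remove a location) can be modelled as operations on a set of live locations for one variable. The Python below is only a toy model written for illustration, not LLVM code. The class `VariableLocations` and its methods `new`, `add` and `delete` are hypothetical names borrowed from the "friendlier syntax" idea, and raising an error on a stale key token is an assumption, since the thread does not say what that case should mean.

```python
class VariableLocations:
    """Toy model of the set of live locations for one source variable."""

    def __init__(self):
        self._next_token = 0
        self.key_token = None  # token returned by the most recent key call
        self.live = {}         # token -> location currently holding the value

    def _fresh_token(self):
        token = self._next_token
        self._next_token += 1
        return token

    def new(self, location):
        """Key call: the source-level value changed, so older locations die."""
        token = self._fresh_token()
        self.key_token = token
        self.live = {} if location is None else {token: location}
        return token

    def add(self, key_token, location):
        """Record an alternative location holding the same value."""
        if key_token != self.key_token:
            # Assumption: the thread leaves the meaning of a stale token open.
            raise ValueError("token does not belong to the current key call")
        token = self._fresh_token()
        self.live[token] = location
        return token

    def delete(self, token):
        """Remove only the location that was introduced by `token`."""
        self.live.pop(token, None)
        return self._fresh_token()

    def locations(self):
        return sorted(self.live.values())


var = VariableLocations()
first = var.new("%someloc")                # key call
second = var.add(first, "%someotherloc")   # alternative location
both = var.locations()
var.delete(second)                         # drop only the alternative
after_delete = var.locations()
print(both, after_delete)
```

In this model the key-call location survives the removal of the alternative, which matches the observation that an earlier location can outlive a later one. It also reproduces the equivalence noted in the message: a key call followed by an explicit removal and a second key call leaves the same live set as two key calls in a row.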
David Blaikie via llvm-dev
2016-Jan-15 22:38 UTC
[llvm-dev] Proposal for multi location debug info support in LLVM IR
I'm reading/following along - discussion so far sounds reasonable to me.

Only minor note: if dbg.value/declare can be narrowed down to one (I think you mentioned in your original proposal that it seemed like everything could be just dbg.value?) that'd be a good step, regardless - possibly ahead of/while this conversation is underway. Or is it the case that the proposed enhanced semantics are required before that transition (because currently dbg.value only goes to the end of the BB? if I recall correctly, whereas dbg.declare is the whole function)? In the latter case, perhaps it'd be a good first step/goal/transition to do as cleanup/generalization anyway.

- Dave
Keno Fischer via llvm-dev
2016-Jan-15 22:44 UTC
[llvm-dev] Proposal for multi location debug info support in LLVM IR
Adrian had proposed the following staging:

1. Remove offset argument from dbg.value
2. Unify dbg.value and dbg.declare
3. Full implementation

I'm not yet sure what to do about the difference in dbg.declare semantics. For example, I think the following currently works

```
top:
  %x = alloca
  br else
if:
  dbg.declare(%x...
  unreachable
else:
  # dbg.declare still applies here
```

I think it would be reasonable to switch to the proposed dominance semantics during step 2, but we'll have to see if that negatively affects any real-world test cases.
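Editorial sketch: the difference between function-wide dbg.declare semantics and the proposed dominance semantics can be checked on the small control-flow graph from this message. The Python below is a rough illustration only, not LLVM code. The helper names `dominators` and `declare_applies` are hypothetical, and the graph assumes, for the sake of the example, that `top` can branch to both `if` and `else`.

```python
def dominators(cfg, entry):
    """Iteratively compute dominator sets for blocks reachable from entry."""
    reachable, stack = set(), [entry]
    while stack:
        block = stack.pop()
        if block not in reachable:
            reachable.add(block)
            stack.extend(cfg[block])
    preds = {b: [p for p in reachable if b in cfg[p]] for b in reachable}
    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {b: set(reachable) for b in reachable}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for block in reachable - {entry}:
            new = {block} | set.intersection(*(dom[p] for p in preds[block]))
            if new != dom[block]:
                dom[block] = new
                changed = True
    return dom


def declare_applies(dom, declare_block, use_block, semantics):
    """Does a declare placed in declare_block describe the variable in use_block?"""
    if semantics == "function":
        return True  # function-wide: position of the declare does not matter
    return declare_block in dom[use_block]  # dominance semantics


# Assumed shape of the example: top branches to both successor blocks.
cfg = {"top": ["if", "else"], "if": [], "else": []}
dom = dominators(cfg, "top")
print(declare_applies(dom, "if", "else", "function"))
print(declare_applies(dom, "if", "else", "dominance"))
```

Under these assumptions the declare in `if` covers `else` only with function-wide semantics, because `if` does not dominate `else`. That is the kind of case that would change behaviour if step 2 switched to dominance semantics.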