Djordje Todorovic via llvm-dev
2020-Sep-01 07:35 UTC
[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR
Hi all,

The debug entry values feature introduces new DWARF symbols (tags, attributes, operations) on the caller (call site) side as well as on the callee side. The intention is to improve the debugging experience, especially in “optimized” code, by turning “<optimized_out>” values into real values.

The call-site information describes the call itself (DW_TAG_call_site), with children representing the function arguments at the call site (DW_TAG_call_site_parameter). The most interesting DWARF attribute for us here is DW_AT_call_value, which contains a DWARF expression representing the value of the parameter at the time of the call.

For this RFC, the more relevant part of the feature is the callee side, which uses a new DWARF operation, DW_OP_entry_value, to indicate that in some situations a parameter’s entry value can be used as its real value in the current frame. It relies on the call-site information provided: the more DW_AT_call_value attributes are generated, the more debug locations using DW_OP_entry_value will resolve to real values.

Current implementation in LLVM

Currently, LLVM generates DW_OP_entry_value *only* for unmodified parameters, during the LiveDebugValues pass, at the places where code generation truncated the live range of the parameter. The potential of the functionality goes beyond this: we should be able to use entry values even for modified parameters, iff the modification can be expressed in terms of the entry value. In addition, there are cases where we can express the values of local variables in terms of some parameter’s entry value (e.g. int local = param + 2;).

Proposal

This RFC is meant to start a discussion about using DW_OP_entry_value not only at the end of the LLVM pipeline (within LiveDebugValues). There are cases where it could be useful at the IR level, e.g. for unused arguments (please take a look at https://reviews.llvm.org/D85012). I believe there are many cases where an IR pass drops or cuts a variable’s debug value info and an entry value could serve as a backup location.

There could be multiple ways to implement this, but in general we need to extend the metadata describing the debug value so that it can also refer to an entry value/backup value (and when the primary location is lost, the value with DW_OP_entry_value becomes the primary one). One way could be to extend llvm.dbg.value with an additional operand, as follows:

  llvm.dbg.value(…, DIEntryValExpression(DW_OP_uconst, 5))
  // DIEntryValExpression implicitly contains the DW_OP_entry_value operation

The bottom line is that the production of the call-site side of the feature stays the same, but LLVM will have more freedom to generate more DW_OP_entry_value operations on the callee side.

Any thoughts on this?

Best regards,
Djordje
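To make the "int local = param + 2;" case from the RFC concrete, here is a minimal sketch of how a debugger might resolve a location built on DW_OP_entry_value. It is a toy model, not LLVM or debugger code: the `evaluate` function, the tuple encoding of operations, and the use of a register name ("rdi") as the DW_OP_entry_value operand are all assumptions made for illustration. The `entry_values` dictionary stands in for whatever the caller's DW_AT_call_value expressions evaluate to.

```python
from typing import Dict, List, Optional, Tuple


def evaluate(expr: List[Tuple], entry_values: Dict[str, int]) -> Optional[int]:
    """Evaluate a toy DWARF-like expression (hypothetical encoding).

    entry_values maps a parameter register to the value recovered from the
    caller's DW_AT_call_value. A missing key models a call site without
    call-site parameter info, in which case the result is None, i.e. the
    variable stays <optimized_out>.
    """
    stack: List[int] = []
    for op, *args in expr:
        if op == "DW_OP_entry_value":
            reg = args[0]
            if reg not in entry_values:
                return None  # caller gave us nothing to resolve against
            stack.append(entry_values[reg])
        elif op == "DW_OP_plus_uconst":
            stack.append(stack.pop() + args[0])
        elif op == "DW_OP_stack_value":
            pass  # the top of the stack is the variable's value
        else:
            raise ValueError(f"unsupported op: {op}")
    return stack[-1] if stack else None


# "int local = param + 2;" expressed in terms of param's entry value.
local_expr = [
    ("DW_OP_entry_value", "rdi"),
    ("DW_OP_plus_uconst", 2),
    ("DW_OP_stack_value",),
]

print(evaluate(local_expr, {"rdi": 40}))  # 42
print(evaluate(local_expr, {}))           # None
```

The second call shows why the callee side depends on the call-site side: without a DW_AT_call_value for the parameter, the entry-value location cannot be turned into a real value.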
David Blaikie via llvm-dev
2020-Sep-01 07:51 UTC
[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR
(+ a few other folks from Google interested in increased optimized debug info location info)

I don't have much context for the variable location part of LLVM's DWARF handling - I've mostly been leaving that to other folks, so take anything I say here with a grain of salt.

My thinking would be that dbg.values for variable locations and dbg.values for "backup" entry_value locations would be basically separate - as though there were two variables. How this would be reflected in the IR, I'm not sure - maybe it's similar to what you're suggesting. (Perhaps you could show a more fleshed-out example? Even for a simple function like "void f1(int i) { f2(); f3(i); f2(); }" or something.)

I guess I would've imagined maybe a way for the dbg.value to include an extra bit saying "I'm an entry value expression" - oh, but I see, there's no way in the IR to have an entry value in the expression? Fair enough, yeah: either using just DW_OP_entry_value with a counted value being the function parameter, or some DW_OP_LLVM_* with more suitable semantics, sounds OK to me. But having a top-level bit on the dbg.value saying "this is a backup/entry_value based location" is probably useful too - mostly ignored by optimizations, which would apply all the same transformations to it to create new locations from old ones, etc.

I guess it would mean a frontend or early pass would create two locations for every parameter: a backup/entry_value based one and a direct one? Though that would be tricky to do up front, since frontends don't do the dataflow analysis, they just create an alloca and let it be read/written to as needed - so maybe entry_value based locations would be created on the fly more often somehow.
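The "two locations per variable, with a top-level backup bit" idea can be sketched as a small data model. This is only an illustration of the suggestion, under stated assumptions: `DbgValue`, its `is_entry_value_backup` field, and `pick_location` are hypothetical names, not existing LLVM classes or intrinsics, and locations are plain strings rather than real IR operands.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DbgValue:
    variable: str
    location: Optional[str]              # None models a dropped location
    is_entry_value_backup: bool = False  # hypothetical top-level bit


def pick_location(records: List[DbgValue], variable: str) -> Optional[str]:
    """Prefer the direct location; fall back to the entry-value backup."""
    primary: Optional[str] = None
    backup: Optional[str] = None
    for record in records:
        if record.variable != variable:
            continue
        if record.is_entry_value_backup:
            backup = record.location
        else:
            primary = record.location  # a later record overrides an earlier one
    return primary if primary is not None else backup


# Parameter "i" of f1: the direct location is still known here ...
live = [DbgValue("i", "$edi"), DbgValue("i", "entry_value(i)", True)]
# ... and here some pass has dropped it, leaving only the backup.
dropped = [DbgValue("i", None), DbgValue("i", "entry_value(i)", True)]

print(pick_location(live, "i"))     # $edi
print(pick_location(dropped, "i"))  # entry_value(i)
```

Keeping the backup as a separate record, rather than folding it into the primary expression, matches the "as though there were two variables" framing: optimizations could transform either record without knowing about the other.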
David Stenberg via llvm-dev
2020-Sep-01 09:38 UTC
[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR
Hi!

I just want to add that I think it would be neat if the entry values could map into the multi-location dbg.values and DBG_VALUEs that are being proposed on this list. For example, if we have:

  int local = param1 + param2 + 123;

I think it would be good if we were able to represent the four different permutations of the values of the parameters being available in the function or as entry values.

I have not yet delved into the discussion about the multi-location debug values, so I don't have any proposals for how that could look.

Best regards,
David
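The four permutations mentioned for "int local = param1 + param2 + 123;" can be enumerated mechanically. The sketch below only spells out the combinations as readable strings; `location_permutations` and the `entry_value(...)` notation are made up for illustration and say nothing about how multi-location debug values would actually encode them.

```python
from itertools import product
from typing import List


def location_permutations(params: List[str], constant: int) -> List[str]:
    """List every way to describe a sum of parameters plus a constant,
    with each parameter either read directly or via its entry value."""
    results = []
    for choice in product(("direct", "entry"), repeat=len(params)):
        terms = [
            name if how == "direct" else f"entry_value({name})"
            for name, how in zip(params, choice)
        ]
        results.append(" + ".join(terms + [str(constant)]))
    return results


for description in location_permutations(["param1", "param2"], 123):
    print(description)
# param1 + param2 + 123
# param1 + entry_value(param2) + 123
# entry_value(param1) + param2 + 123
# entry_value(param1) + entry_value(param2) + 123
```

With n parameters there are 2**n such combinations, which hints at why a representation that tracks each operand's availability separately would scale better than listing whole expressions.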
Djordje Todorovic via llvm-dev
2020-Sep-01 13:48 UTC
[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR
Hi David,

Thanks for your feedback.

> My thinking would be that dbg.values for variable locations and dbg.values for "backup" entry_value locations would be basically separate - as though there were two variables. How this would be reflected in the IR, I'm not sure - maybe it's similar to what you're suggesting. (Perhaps you could show a more fleshed-out example? Even for a simple function like "void f1(int i) { f2(); f3(i); f2(); }" or something.)
>
> I guess I would've imagined maybe a way for the dbg.value to include an extra bit saying "I'm an entry value expression" - oh, but I see, there's no way in the IR to have an entry value in the expression? Fair enough, yeah: either using just DW_OP_entry_value with a counted value being the function parameter, or some DW_OP_LLVM_* with more suitable semantics, sounds OK to me. But having a top-level bit on the dbg.value saying "this is a backup/entry_value based location" is probably useful too - mostly ignored by optimizations, which would apply all the same transformations to it to create new locations from old ones, etc.

We have an LLVM-internal operation (DW_OP_LLVM_entry_value), but I think we might need something different/more complex (e.g. a flag that indicates it is an entry_value/backup, since it needs to coexist with the real value). An alternative could be a separate intrinsic, llvm.dbg.entry_val(), but I think we all want to avoid extra intrinsics if possible.

> I guess it would mean a frontend or early pass would create two locations for every parameter: a backup/entry_value based one and a direct one? Though that would be tricky to do up front, since frontends don't do the dataflow analysis, they just create an alloca and let it be read/written to as needed - so maybe entry_value based locations would be created on the fly more often somehow.

I guess we'd need something like that, but the "on-the-fly" model would be more acceptable. Or a separate IR pass for that purpose, but it would introduce some extra overhead...

Best regards,
Djordje
Djordje Todorovic via llvm-dev
2020-Sep-01 13:54 UTC
[llvm-dev] [RFC] [DebugInfo] Using DW_OP_entry_value within LLVM IR
Hi David,

Thanks for your comments!

> I just want to add that I think it would be neat if the entry values could map into the multi-location dbg.values and DBG_VALUEs that are being proposed on this list. For example, if we have:
>
>   int local = param1 + param2 + 123;
>
> I think it would be good if we were able to represent the four different permutations of the values of the parameters being available in the function or as entry values.
>
> I have not yet delved into the discussion about the multi-location debug values, so I don't have any proposals for how that could look.

I guess it can (somehow) be mapped into that. It is clear to me that the usage of DBG_VALUE_LIST will be appropriate for the "Salvage Debug Info" work, but the idea of using entry values at the IR level is more general (not very localized), and that is a potential source of complexity, since we need to carry that info throughout the IR and use it as a backup.

Best regards,
Djordje