Jonas Paulsson via llvm-dev
2018-May-08 06:29 UTC
[llvm-dev] DEBUG INFO: improve handling of DBG_VALUEs and DebugLocs in CodeGen
Hi, (Resent with proper subject line) I recently found myself in trouble because the crash I had disappeared with -g, so I could not debug the program. This happened because the optimizer did not remember to consider DBG_VALUEs instruction so it changed its behavior, and the bug went hiding. I then started discussing this onhttps://reviews.llvm.org/D45878, and since this is something that should be handled by all different machine passes, the immediate question is now what utility functions should be made available for common use. 1. A pass such as MachineSink.cpp must first of all remember to consider DBG_VALUEs when iterating over instructions, so that it does not e.g. "break if next instruction is not X" and therefore change its results with -g. 2. At a second priority, a) it should handle DBG_VALUEs when moving / erasing MachineInstrs. b) it should " // Merge or erase debug location to ensure consistent stepping in profilers and debuggers." (MachineSink.cpp). Well, this is in fact the first point here: I think this should be documented somewhere by a comment so that it is clear exactly how these things should be handled. My personal understanding right now is that 2 is not vital but should be done on a best-effort basis. 1 however is really bad if it happens (which it currently does). We are discussing where to place these utility functions, and if splice() should optionally be made to do this also. It seems that collectDebugValues() in MachineSink.cpp is a good starting point, but we probably want to do an even better search than just looking at the first def operand. It would be nice to get some general feedback from the community at this point so we know which direction to take... /Jonas
Adrian Prantl via llvm-dev
2018-May-08 16:55 UTC
[llvm-dev] DEBUG INFO: improve handling of DBG_VALUEs and DebugLocs in CodeGen
> On May 7, 2018, at 11:29 PM, Jonas Paulsson via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi, (Resent with proper subject line) > > I recently found myself in trouble because the crash I had disappeared > with -g, so I could not debug the program. This happened because the > optimizer did not remember to consider DBG_VALUEs instruction so it > changed its behavior, and the bug went hiding.[Here are some (very high-level) thoughts.] As you suspect, that's clearly a bug. LLVM's ideal behavior is that the presence of debug informations shall not affect the contents of the .text section. As you also noticed, we're not quite there yet.> > I then started discussing this on https://reviews.llvm.org/D45878, and > since this is something that should be handled by all different machine > passes, the immediate question is now what utility functions should be > made available for common use. > > 1. A pass such as MachineSink.cpp must first of all remember to consider > DBG_VALUEs when iterating over instructions, so that it does not e.g. > "break if next instruction is not X" and therefore change its results > with -g.We started a similar discussion at the LLVM IR level very recently. As a result of this we added new iterators to that make it easy to skip of debug intrinsics when doing analysis passes...> > 2. At a second priority, > > a) it should handle DBG_VALUEs when moving / erasing MachineInstrs.... but, skipping them during analysis is not enough, they also must move when the value they are describing moves.> > b) it should " // Merge or erase debug location to ensure > consistent stepping in profilers and debuggers." (MachineSink.cpp). > > Well, this is in fact the first point here: I think this should be > documented somewhere by a comment so that it is clear exactly how these > things should be handled. My personal understanding right now is that 2 > is not vital but should be done on a best-effort basis. 1 however is > really bad if it happens (which it currently does).Loosing DBG_VALUEs is bad because local variables will become unavailable in the debugger. This is completely unacceptable when compiling unoptimized code. In optimized code the expectations are somewhat lower, but there is a lot of low-hanging fruit left to pick where LLVM clearly could do a better job of preserving DBG_VALUEs. Incorrect debug locations are also unacceptable in unoptimized code, since they will cause the debugger's single-stepping to behave unpredictably, thus destroying the debugging experience. In optimized code, the expectations are a bit lower, but *incorrect* debug locations are terrible since they will cause misleading crash traces and otherwise undebuggable code. Due to the nature of what an optimizing compiler does, debug information is not always salvageable, but the correct thing to do in this case is to remove the debug information (e.g., by inserting line: 0 locations) instead of providing incorrect or partially correct information.> > We are discussing where to place these utility functions, and if > splice() should optionally be made to do this also. It seems that > collectDebugValues() in MachineSink.cpp is a good starting point, but we > probably want to do an even better search than just looking at the first > def operand. > > It would be nice to get some general feedback from the community at this > point so we know which direction to take...Thank you for starting this discussion! We should do what we can to make easy for pass authors to do the right thing. We should also do what we can to force pass authors to have to thing about debug information and how to update it correctly, even if that is annoying :-) Would creating a MIR version of the debugify IR pass be helpful in finding more of the existing bugs? -- adrian> /Jonas > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Vedant Kumar via llvm-dev
2018-May-08 18:06 UTC
[llvm-dev] DEBUG INFO: improve handling of DBG_VALUEs and DebugLocs in CodeGen
> On May 8, 2018, at 9:55 AM, Adrian Prantl <aprantl at apple.com> wrote: > > > >> On May 7, 2018, at 11:29 PM, Jonas Paulsson via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> Hi, (Resent with proper subject line) >> >> I recently found myself in trouble because the crash I had disappeared >> with -g, so I could not debug the program. This happened because the >> optimizer did not remember to consider DBG_VALUEs instruction so it >> changed its behavior, and the bug went hiding. > > [Here are some (very high-level) thoughts.] > > As you suspect, that's clearly a bug. LLVM's ideal behavior is that the presence of debug informations shall not affect the contents of the .text section. As you also noticed, we're not quite there yet. > >> >> I then started discussing this on https://reviews.llvm.org/D45878, and >> since this is something that should be handled by all different machine >> passes, the immediate question is now what utility functions should be >> made available for common use. >> >> 1. A pass such as MachineSink.cpp must first of all remember to consider >> DBG_VALUEs when iterating over instructions, so that it does not e.g. >> "break if next instruction is not X" and therefore change its results >> with -g. > > We started a similar discussion at the LLVM IR level very recently. As a result of this we added new iterators to that make it easy to skip of debug intrinsics when doing analysis passes... > >> >> 2. At a second priority, >> >> a) it should handle DBG_VALUEs when moving / erasing MachineInstrs. > > ... but, skipping them during analysis is not enough, they also must move when the value they are describing moves.A starting point for diagnosing issues which arise when MI's aren't moved when needed might be a use-before-def verifier at the MIR level (related: https://reviews.llvm.org/D46100 <https://reviews.llvm.org/D46100>).>> b) it should " // Merge or erase debug location to ensure >> consistent stepping in profilers and debuggers." (MachineSink.cpp). >> >> Well, this is in fact the first point here: I think this should be >> documented somewhere by a comment so that it is clear exactly how these >> things should be handled. My personal understanding right now is that 2 >> is not vital but should be done on a best-effort basis. 1 however is >> really bad if it happens (which it currently does). > > Loosing DBG_VALUEs is bad because local variables will become unavailable in the debugger. This is completely unacceptable when compiling unoptimized code. In optimized code the expectations are somewhat lower, but there is a lot of low-hanging fruit left to pick where LLVM clearly could do a better job of preserving DBG_VALUEs. > > Incorrect debug locations are also unacceptable in unoptimized code, since they will cause the debugger's single-stepping to behave unpredictably, thus destroying the debugging experience. In optimized code, the expectations are a bit lower, but *incorrect* debug locations are terrible since they will cause misleading crash traces and otherwise undebuggable code. Due to the nature of what an optimizing compiler does, debug information is not always salvageable, but the correct thing to do in this case is to remove the debug information (e.g., by inserting line: 0 locations) instead of providing incorrect or partially correct information. > >> >> We are discussing where to place these utility functions, and if >> splice() should optionally be made to do this also.This sounds like a good idea to me.>> It seems that >> collectDebugValues() in MachineSink.cpp is a good starting point, but we >> probably want to do an even better search than just looking at the first >> def operand.thanks, vedant>> >> It would be nice to get some general feedback from the community at this >> point so we know which direction to take... > > Thank you for starting this discussion! We should do what we can to make easy for pass authors to do the right thing. We should also do what we can to force pass authors to have to thing about debug information and how to update it correctly, even if that is annoying :-) > > Would creating a MIR version of the debugify IR pass be helpful in finding more of the existing bugs? > > -- adrian > >> /Jonas >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180508/ad12f4cc/attachment.html>
Jonas Paulsson via llvm-dev
2018-May-18 13:49 UTC
[llvm-dev] DEBUG INFO: improve handling of DBG_VALUEs and DebugLocs in CodeGen
Thanks for the replies! What is the next step now? Refactor out some method(s) (perhaps from MachineSink) into utility functions so that they can be used elsewhere? Who wants to write a comment somewhere to summarize these points? /Jonas On 2018-05-08 18:55, Adrian Prantl wrote:> >> On May 7, 2018, at 11:29 PM, Jonas Paulsson via llvm-dev <llvm-dev at lists.llvm.org> wrote: >> >> Hi, (Resent with proper subject line) >> >> I recently found myself in trouble because the crash I had disappeared >> with -g, so I could not debug the program. This happened because the >> optimizer did not remember to consider DBG_VALUEs instruction so it >> changed its behavior, and the bug went hiding. > [Here are some (very high-level) thoughts.] > > As you suspect, that's clearly a bug. LLVM's ideal behavior is that the presence of debug informations shall not affect the contents of the .text section. As you also noticed, we're not quite there yet. > >> I then started discussing this on https://reviews.llvm.org/D45878, and >> since this is something that should be handled by all different machine >> passes, the immediate question is now what utility functions should be >> made available for common use. >> >> 1. A pass such as MachineSink.cpp must first of all remember to consider >> DBG_VALUEs when iterating over instructions, so that it does not e.g. >> "break if next instruction is not X" and therefore change its results >> with -g. > We started a similar discussion at the LLVM IR level very recently. As a result of this we added new iterators to that make it easy to skip of debug intrinsics when doing analysis passes... > >> 2. At a second priority, >> >> a) it should handle DBG_VALUEs when moving / erasing MachineInstrs. > ... but, skipping them during analysis is not enough, they also must move when the value they are describing moves. > >> b) it should " // Merge or erase debug location to ensure >> consistent stepping in profilers and debuggers." (MachineSink.cpp). >> >> Well, this is in fact the first point here: I think this should be >> documented somewhere by a comment so that it is clear exactly how these >> things should be handled. My personal understanding right now is that 2 >> is not vital but should be done on a best-effort basis. 1 however is >> really bad if it happens (which it currently does). > Loosing DBG_VALUEs is bad because local variables will become unavailable in the debugger. This is completely unacceptable when compiling unoptimized code. In optimized code the expectations are somewhat lower, but there is a lot of low-hanging fruit left to pick where LLVM clearly could do a better job of preserving DBG_VALUEs. > > Incorrect debug locations are also unacceptable in unoptimized code, since they will cause the debugger's single-stepping to behave unpredictably, thus destroying the debugging experience. In optimized code, the expectations are a bit lower, but *incorrect* debug locations are terrible since they will cause misleading crash traces and otherwise undebuggable code. Due to the nature of what an optimizing compiler does, debug information is not always salvageable, but the correct thing to do in this case is to remove the debug information (e.g., by inserting line: 0 locations) instead of providing incorrect or partially correct information. > >> We are discussing where to place these utility functions, and if >> splice() should optionally be made to do this also. It seems that >> collectDebugValues() in MachineSink.cpp is a good starting point, but we >> probably want to do an even better search than just looking at the first >> def operand. >> >> It would be nice to get some general feedback from the community at this >> point so we know which direction to take... > Thank you for starting this discussion! We should do what we can to make easy for pass authors to do the right thing. We should also do what we can to force pass authors to have to thing about debug information and how to update it correctly, even if that is annoying :-) > > Would creating a MIR version of the debugify IR pass be helpful in finding more of the existing bugs? > > -- adrian > >> /Jonas >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Reasonably Related Threads
- DEBUG INFO: improve handling of DBG_VALUEs and DebugLocs in CodeGen
- gberry@codeaurora.org
- Ideas for managing DBG_VALUE machine instructions.
- [RFC] DebugInfo: A different way of specifying variable locations post-isel
- [DebugInfo] A value-tracking variable location update