One problem that has been vexing me of late: It seems that whenever I run into a problem that requires debugging one of my programs in gdb, before I can do that I have to fix my frontend's broken generation of debugging info.

The code that generates debugging information is quite fragile - you have to generate metadata for each of your files, classes, and functions, and do so without error, because if you do make a mistake, the only way you'll find out is because gdb refuses to debug your program. And as I work on the code, occasionally bugs creep in, either from my side or occasionally from the LLVM side. The problem is, that I don't always check if the debug information is valid, so several weeks can go by and I don't notice something broke.

What is needed is some way to write a unit test for DWARF information, so that if I broke something I would notice it immediately and could either fix it or roll back. Unfortunately, the various DIDescriptor.Verify() methods are nowhere near strict enough - you can create completely nonsensical DIEs that still pass through Verify(). And even if the Verify() methods were 100% reliable, they only test whether the LLVM metadata is valid - they don't test whether the actual DWARF embedded in the final executable is correct.

I suppose you could do something with dwarfdump -ka, although it would be better to have something that worked on all platforms. Even dwarfdump itself has different option syntax on Linux vs. OS X. And I don't think it's possible right now to generate code that passes through dwarfdump with zero error messages, or at least, I've never been able to figure out how to do it.

I was thinking that since lldb needs to know how to interpret all this stuff anyway, perhaps there could be a way to use the same code to validate the debug information for an executable. I know lldb doesn't run on every platform yet, but I suspect that the parts of lldb which decode DWARF are fairly generic.
--
-- Talin
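
As a rough illustration of the kind of automated check asked for above, the sketch below wraps an external DWARF checker in a function that a unit test can call. The checker command is supplied by the caller, since option syntax differs between platforms. It assumes, and this is only an assumption, that the tool exits with a nonzero status or prints lines containing "error" when it rejects the debug info. The names check_debug_info and VerifyResult are hypothetical and are not part of LLVM or dwarfdump.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class VerifyResult:
    ok: bool            # True when the checker reported nothing wrong
    errors: List[str]   # lines (or a status note) describing the failure


def check_debug_info(checker_cmd: Sequence[str], binary_path: str) -> VerifyResult:
    """Run an external DWARF checker on binary_path and summarize the outcome.

    checker_cmd is whatever the platform provides, for example
    ["dwarfdump", "-ka"]; the caller chooses the flags.
    ASSUMPTION: a nonzero exit status, or an output line containing
    "error", means the debug info was rejected.
    """
    proc = subprocess.run(
        list(checker_cmd) + [binary_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold stderr into the same transcript
        universal_newlines=True,
    )
    errors = [line for line in proc.stdout.splitlines() if "error" in line.lower()]
    if proc.returncode != 0 and not errors:
        # The tool failed without printing a recognizable message.
        errors.append("checker exited with status %d" % proc.returncode)
    return VerifyResult(ok=not errors, errors=errors)
```

A test would build a small program with the frontend and assert that check_debug_info(cmd, path).ok holds. This only catches what the external tool catches; it does not show that a debugger can actually use the information.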
On 17 March 2011 06:40, Talin <viridia at gmail.com> wrote:
> The code that generates debugging information is quite fragile - you have to
> generate metadata for each of your files, classes, and functions, and do so
> without error, because if you do make a mistake, the only way you'll find
> out is because gdb refuses to debug your program. And as I work on the code,
> occasionally bugs creep in, either from my side or occasionally from the
> LLVM side. The problem is, that I don't always check if the debug
> information is valid, so several weeks can go by and I don't notice
> something broke.

Strongly agree.

> What is needed is some way to write a unit test for DWARF information, so
> that if I broke something I would notice it immediately and could either fix
> it or roll back. Unfortunately, the various DIDescriptor.Verify() methods
> are nowhere near strict enough - you can create completely nonsensical DIEs
> that still pass through Verify(). And even if the Verify() methods were 100%
> reliable, they only test whether the LLVM metadata is valid - they don't
> test whether the actual DWARF embedded in the final executable is correct.

Strongly agree. But I go further...

I could help with the verification process (since it's much better to fail verification than to fail gdb testsuite), but I don't know the design decisions being taken for debug information/metadata, and they change too frequently to dig the code to learn. There is no API documentation and the interface (IR metadata) docs are old and inaccurate.

I'd say, in order of importance, the three things that need to be done ASAP are:

1. Stick to one representation and document it (like LangRef), so other people could help
2. Enhance Validate() methods to be extremely strict (like Module's), so it fails straight away
3. Create tests (unit and regression) and run them during check-all, so we don't regress

The tests are last because it's much easier to catch an assertion than a silent codegen error.

After the initial period, we iterate those three steps (and not less!) again and again, until debug information is good.

I see the importance of changing the IR (as I've requested quite a few times) but I understand that it's better for everyone to have a stable IR. Every new version can have a few changes, not necessarily backward compatible, but those also need to be documented beforehand (mailing list, blog, release notes).

If we follow the three steps above in an iterative way, during every release, we can achieve stability AND feature completeness. But (IMHO), stability comes first.

cheers,
--renato
Talin,

If there is a magic wand, I would be interested to know!

DIDescriptor.Verify() is not suitable for your needs. It checks the structure of the encoded debug info after the optimizer has modified the IR. Its main goal is to inform the DWARF writer, at the end of code gen, which IR constructs it should ignore.

If you want to test code gen, you have to link the compiled code and run it regularly. That's what the various build bots for llvm do. In the same way, if you want to validate generated debug info, you have to go through the debugger.

That said, there is a new unit test harness available. All it needs is more unit tests...
http://llvm.org/docs/TestingGuide.html#quickdebuginfotests

- Devang
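
To make "go through the debugger" concrete, here is a minimal sketch of a debugger-driven check. It is not how the harness linked above is implemented; it only assumes that a gdb binary is on PATH and relies on gdb's --batch, -nx and -ex command-line options. The helper names (build_gdb_command, missing_expectations, run_debugger_check) are hypothetical.

```python
import subprocess
from typing import List, Sequence


def build_gdb_command(program: str, commands: Sequence[str]) -> List[str]:
    """Build a gdb invocation that runs `commands` non-interactively and exits."""
    argv = ["gdb", "--batch", "-nx"]   # -nx: skip the user's init files
    for command in commands:
        argv += ["-ex", command]
    argv.append(program)
    return argv


def missing_expectations(transcript: str, expected: Sequence[str]) -> List[str]:
    """Return the expected substrings that do not appear, in order, in transcript."""
    missing = []
    position = 0
    for text in expected:
        found = transcript.find(text, position)
        if found < 0:
            missing.append(text)
        else:
            # Later expectations must match after this one.
            position = found + len(text)
    return missing


def run_debugger_check(program: str, commands: Sequence[str],
                       expected: Sequence[str]) -> List[str]:
    """Run gdb on `program` and report which expected strings were not printed."""
    proc = subprocess.run(
        build_gdb_command(program, commands),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return missing_expectations(proc.stdout, expected)
```

A test passes when run_debugger_check returns an empty list, for example after setting a breakpoint, running, and printing a variable whose value is known. The exact text gdb prints varies between versions, so the expected strings should be kept short.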
Renato,

On Mar 17, 2011, at 3:25 AM, Renato Golin wrote:

> could help with the verification process (since it's much better to
> fail verification than to fail gdb testsuite), but I don't know the
> design decisions being taken for debug information/metadata, and they
> change too frequently to dig the code to learn.

I think you are mistaken here. I maintain and support debug info for two front ends (llvm-gcc and clang). Go ahead and check the svn archives for the last year and see how many times I had to update the llvm-gcc FE.

> There is no API
> documentation and the interface (IR metadata) docs are old and
> inaccurate.
>
> I'd say, in order of importance, the three things that need to be done ASAP are:
>
> 1. Stick to one representation and document it (like LangRef), so
> other people could help

In the last 5 or so llvm releases, the encoded debug info representation in llvm IR has changed only once (using metadata instead of global variables). All other changes are incremental *and* backward compatible.

Regarding documentation, it is on my list. However, your argument has the same disconnect as someone who looks at LangRef and says, "I do not know what exactly the FE has to generate to produce a working program." Well, what you need is a How To Write a Front End document.

> 2. Enhance Validate() methods to be extremely strict (like Module's),
> so it fails straight away

See my response regarding Verify().

> 3. Create tests (unit and regression) and run them during check-all,
> so we don't regress

I have already mentioned debuginfo-tests at least once to you earlier.

> The tests are last because it's much easier to catch an assertion than
> a silent codegen error.

- Devang
Could dwarfdump --verify be used to check the debug info?

- Jan

________________________________
From: Devang Patel <dpatel at apple.com>
To: Talin <viridia at gmail.com>
Cc: LLVM Developers Mailing List <llvmdev at cs.uiuc.edu>
Sent: Thu, March 17, 2011 9:41:10 AM
Subject: Re: [LLVMdev] Writing unit tests for DWARF?

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev