Zhizhou Yang via llvm-dev
2018-Sep-18 22:49 UTC
[llvm-dev] Pass and Transformation-level debugging in LLVM
Hi all,

Debugging a mis-compilation is always time consuming. I recently made some attempts at bisecting bad passes in LLVM and would like to share some ideas about how we can make this work. Meanwhile, I would also encourage the community to make each pass more bisectable with the help of DebugCounter.

We already have a very useful helper in LLVM for pass-level bisection: OptBisect <https://llvm.org/docs/OptBisect.html>. Although it only works with the legacy pass manager, Fedor has proposed an ongoing effort <https://reviews.llvm.org/D50923> to make something similar work for the new pass manager.

To bring it to the next level, DebugCounter <http://llvm.org/docs/ProgrammersManual.html#adding-debug-counters-to-aid-in-debugging-your-code> lets me set an in-pass (transformation-level) limit, so I can tell exactly which transformation in the pass caused the error. When we set a StopAfter value for a DebugCounter, it acts as a limit: the counter stops there.

In D50031 <https://reviews.llvm.org/D50031> and rL337748 <https://reviews.llvm.org/rL337748>, I added a way to print DebugCounter info: the `-print-debug-counter` flag. With this, writing a transformation-level bisection script becomes more straightforward.

The issue we face is that transformation-level bisection only works in passes that have DebugCounters, and very few passes have them today. DebugCounter is also very useful when a pass author debugs manually, without special bisection tooling. So I would encourage the community to add DebugCounters to your own passes to make debugging easier.

Adding DebugCounters is usually not too difficult. For example, I have several patches that add DebugCounters to passes: D50092 <https://reviews.llvm.org/D50092>, D50033 <https://reviews.llvm.org/D50033>.

I have already built a bisection tool to help the Android toolchain team debug on our side, and I can say that bisecting with OptBisect and DebugCounter saves us time when debugging mis-compilations.
Feel free to post if you have other ideas on this. I hope this thread can help.

Thanks,
Zhizhou
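The transformation-level bisection idea described above can be illustrated with a small, self-contained sketch. This is a toy model only, not LLVM code: `run_pass`, `find_first_bad`, and the `bad_index` parameter are hypothetical names invented for the illustration, and the model assumes that applying more transformations never turns a bad result back into a good one.

```python
def run_pass(num_transforms, limit, bad_index):
    """Toy stand-in for a pass whose transformations are guarded by a counter.

    Transformations 0..limit-1 are applied and the rest are skipped.
    Returns True if the result is still good, i.e. the (hypothetical)
    bad transformation at position bad_index was not applied.
    """
    applied = min(limit, num_transforms)
    return bad_index >= applied


def find_first_bad(num_transforms, is_good_with_limit):
    """Binary-search for the first transformation that breaks the output.

    is_good_with_limit(n) must report whether the output is good when only
    the first n transformations are applied. Returns the zero-based index
    of the first bad transformation, or None if nothing is bad.
    """
    if is_good_with_limit(num_transforms):
        return None
    lo, hi = 0, num_transforms  # invariant: good at lo, bad at hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good_with_limit(mid):
            lo = mid
        else:
            hi = mid
    return hi - 1


if __name__ == "__main__":
    # Assumed scenario: 100 transformations, number 37 is the culprit.
    print(find_first_bad(100, lambda n: run_pass(100, n, 37)))
```

In a real setup the callback would presumably run the compiler with the counter limit set and check whether the resulting program still misbehaves; here it is replaced by the toy `run_pass`.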
David Greene via llvm-dev
2018-Sep-20 16:56 UTC
[llvm-dev] Pass and Transformation-level debugging in LLVM
Zhizhou Yang via llvm-dev <llvm-dev at lists.llvm.org> writes:

> To bring it to the next level, DebugCounter provides features for me
> to have an in-pass (transformation) level limit to tell which
> transform in the pass exactly caused the error. When we set StopAfter
> value for a DebugCounter, it will eventually stop there as a limit.
>
> And in D50031 and rL337748, I added a method to print DebugCounter
> info: the `-print-debug-counter` flag. With this, writing a
> transformation level bisection script will be more straightforward.
>
> I have already built a bisection tool to help Android toolchain debug
> on our side. So I will say that the bisecting idea with OptBisect and
> DebugCounter helps us save time while debugging mis-compilations.

There is already a DebugCounter bisect tool in utils/bisect-skip-count.

It is not documented, unfortunately. I had to figure it out by inspection, but you use it by including "%(skip)d" and "%(count)d" in the command you pass to bisect-skip-count. Those values then get filled in, and your command should respond to them appropriately. For example:

    bisect-skip-count bisect-command.sh "%(skip)d" "%(count)d" 2>&1 | tee bisect.out

bisect-command.sh presumably looks something like this:

    #!/bin/bash

    skip=$1
    count=$2

    opt --debug-counter=my-counter-skip=${skip},my-counter-count=${count} ...

I recently used bisect-skip-count in this way very successfully to track down an aliasing bug deep in the machine scheduler. I'm working on documenting bisect-skip-count so people know about it. I can add comments to the script, but I haven't looked at updating the web page sources yet. I was thinking of adding something to the existing opt-bisect page. Guidance here would be helpful.

I agree that anyone who adds DebugCounters should propose those changes on Phabricator. We can incrementally improve the debuggability of LLVM with such a process.

-David
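The "%(skip)d" and "%(count)d" placeholders follow Python's dictionary-based `%` string formatting. The sketch below shows how such a command template could be expanded. `expand_command` is a hypothetical helper written for this illustration and is not taken from the bisect-skip-count source, so the real script may differ in detail.

```python
def expand_command(template_args, skip, count):
    """Fill %(skip)d and %(count)d placeholders in each argument.

    Arguments without placeholders pass through unchanged. Note that this
    simple approach would fail on arguments containing a literal '%'.
    """
    values = {"skip": skip, "count": count}
    return [arg % values for arg in template_args]


if __name__ == "__main__":
    template = ["bisect-command.sh", "%(skip)d", "%(count)d"]
    print(expand_command(template, 3, 5))
    # prints ['bisect-command.sh', '3', '5']
```

Only the `%(name)d` form is substituted by this mechanism; any other spelling is passed to the command as literal text.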
Zhizhou Yang via llvm-dev
2018-Sep-26 20:36 UTC
[llvm-dev] Pass and Transformation-level debugging in LLVM
I read through this tool, and I believe it could do a perfect job as a general bisecting script using DebugCounter.

Just one small nit: I noticed that you are using (1<<32) as the default upper bound. I have a patch that can print the total number of transformations in a single pass: https://reviews.llvm.org/D50031. I think it may be helpful to let the user know exactly how many transformations there are in the pass, so that they have a high-level idea of where the bad transformation is.

On Thu, Sep 20, 2018 at 9:56 AM David Greene <dag at cray.com> wrote:

> There is already a DebugCounter bisect tool in utils/bisect-skip-count.
>
> It is not documented, unfortunately. I had to figure it out by
> inspection, but you use it by including "%(skip)d" and "%(count)d" in
> the command you specify to bisect-skip-count.
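To give a rough sense of what a tighter upper bound buys, the sketch below counts how many halving steps a bisection needs for a given bound. `bisection_steps` is a hypothetical helper, and the figure of 1000 transformations is an assumed example rather than a number from any real pass.

```python
def bisection_steps(upper_bound):
    """Count the halvings needed to narrow upper_bound candidates to one."""
    steps = 0
    remaining = upper_bound
    while remaining > 1:
        remaining = (remaining + 1) // 2  # keep the larger half, worst case
        steps += 1
    return steps


if __name__ == "__main__":
    print(bisection_steps(1 << 32))  # default-style upper bound
    print(bisection_steps(1000))     # assumed total reported for a pass
```

With the default-style bound this gives 32 steps, against 10 for a pass known to perform 1000 transformations. Beyond the saved runs, knowing the real total also tells the user where in the pass the offending transformation sits.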