Grang, Mandeep Singh via llvm-dev
2017-Aug-29 18:45 UTC
[llvm-dev] Uncovering non-determinism in LLVM - An Update
Hi All,

I wanted to share a couple of updates on the effort to uncover non-determinism in LLVM through reverse iteration.

1. Reverse iteration has now been enabled for DenseMap (reviews.llvm.org/D35043).

2. We have set up a nightly reverse iteration buildbot (lab.llvm.org:8011/builders/reverse-iteration). This builds all LLVM targets with reverse iteration ON and runs ninja check-all. Currently there are 14 unit test failures. Please feel free to fix these.

Also, currently only I receive the nightly email notification for this buildbot run. My plan is to enable sending the nightly notifications to llvm-commits once all 14 failures have been resolved. Please let me know if the community wants the nightly notifications even with the failures.

As a potential next step, I was thinking about bootstrapping this reverse iteration LLVM to compile itself. Not sure if it can uncover more bugs but maybe worth a shot.

All comments/suggestions welcome.

Thanks,
Mandeep
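Editor's sketch of the idea behind reverse iteration, written in Python for brevity. This is not LLVM code: `UnorderedBag`, `emit_order_dependent`, `emit_deterministic` and `is_order_sensitive` are hypothetical names invented for illustration. The assumption is simply that a container with unspecified iteration order can be walked forwards or backwards, and that any code whose output changes between the two is relying on an order it was never promised.

```python
class UnorderedBag:
    """Toy stand-in for an unordered container such as DenseMap.

    Hypothetical class: its iteration order is unspecified by contract,
    so callers must not rely on it.
    """

    def __init__(self, items, reverse_iteration=False):
        self._items = list(items)
        self._reverse = reverse_iteration

    def __iter__(self):
        # Flip the traversal order when the debugging switch is on.
        if self._reverse:
            return iter(reversed(self._items))
        return iter(self._items)


def emit_order_dependent(bag):
    # Bug: the output depends on the container's iteration order.
    return ",".join(bag)


def emit_deterministic(bag):
    # Fix: impose an explicit order before emitting.
    return ",".join(sorted(bag))


def is_order_sensitive(emit, items):
    """Return True if emit() gives different output under reverse iteration."""
    forward = emit(UnorderedBag(items, reverse_iteration=False))
    backward = emit(UnorderedBag(items, reverse_iteration=True))
    return forward != backward
```

Calling `is_order_sensitive(emit_order_dependent, ["b", "a", "c"])` flags the first function, while the sorted variant passes. A test that fails only with reverse iteration ON is a hint of the same kind of dependence.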
David Blaikie via llvm-dev
2017-Aug-30 16:51 UTC
[llvm-dev] [cfe-dev] Uncovering non-determinism in LLVM - An Update
On Tue, Aug 29, 2017 at 11:45 AM Grang, Mandeep Singh via cfe-dev <cfe-dev at lists.llvm.org> wrote:

> Hi All,
>
> I wanted to share a couple of updates on the effort to uncover
> non-determinism in LLVM through reverse iteration.
>
> 1. Reverse iteration has now been enabled for DenseMap
> (reviews.llvm.org/D35043)
>
> 2. We have set up a nightly reverse iteration buildbot
> (lab.llvm.org:8011/builders/reverse-iteration).
> This builds all LLVM targets with reverse iteration ON and runs ninja
> check-all. Currently there are 14 unit test failures. Please feel free
> to fix these.
>
> Also currently, only I receive the nightly email notification for this
> buildbot run. My plan is to enable sending the nightly notifications to
> llvm-commits once all 14 failures have been resolved.
> Please let me know if the community wants the nightly notifications even
> with the failures.
> As a potential next step, I was thinking about bootstrapping this
> reverse iteration LLVM to compile itself. Not sure if it can uncover
> more bugs but maybe worth a shot.

To uncover bugs in this configuration, I believe you'd want/need a stage2/stage3 comparison, which might be a bit tricky/expensive*. Something like:

build clang twice (reverse and forward enabled), then build (in one mode, doesn't matter which I think) clang or other release binaries (or even the whole release) from each of those and compare them bit-for-bit. They should be identical.

* If you want other developers to act on bugs found, the buildbot needs to have a short blame list (this can be done on a slow buildbot by having multiple slaves/builders running in parallel) but preferably also a short cycle time (so failures are reported soon after they're created). Otherwise, expect to do a lot of triage yourself (and possibly leave the emails going only to you, because they'll have too large blame lists/revision ranges and people won't find them actionable), and then probably to follow up on the specific commit you believe introduced the problem, either fixing it yourself or replying on the commits list to report it to the original contributor.

> All comments/suggestions welcome.
>
> Thanks,
> Mandeep
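Editor's sketch of the bit-for-bit comparison step described above, in Python. It assumes the two sets of binaries have already been built into two separate directory trees. The function names `file_digest`, `tree_digests` and `differing_files` are hypothetical helpers, not part of any LLVM tooling. It hashes every file in both trees and lists the relative paths whose contents differ or that exist on only one side.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its content digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def differing_files(tree_a, tree_b):
    """List relative paths that differ between, or are missing from, the trees."""
    a = tree_digests(tree_a)
    b = tree_digests(tree_b)
    return sorted(
        name for name in a.keys() | b.keys() if a.get(name) != b.get(name)
    )
```

An empty result from `differing_files(build_from_forward, build_from_reverse)` would mean the two outputs are identical. Any path it returns is a candidate for an iteration-order dependence, though embedded timestamps or build paths can also cause differences and would need to be ruled out.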
Diana Picus via llvm-dev
2017-Aug-31 10:20 UTC
[llvm-dev] [cfe-dev] Uncovering non-determinism in LLVM - An Update
On 30 August 2017 at 18:51, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> On Tue, Aug 29, 2017 at 11:45 AM Grang, Mandeep Singh via cfe-dev
> <cfe-dev at lists.llvm.org> wrote:
>>
>> Hi All,
>>
>> I wanted to share a couple of updates on the effort to uncover
>> non-determinism in LLVM through reverse iteration.
>>
>> 1. Reverse iteration has now been enabled for DenseMap
>> (reviews.llvm.org/D35043)
>>
>> 2. We have set up a nightly reverse iteration buildbot
>> (lab.llvm.org:8011/builders/reverse-iteration).
>> This builds all LLVM targets with reverse iteration ON and runs ninja
>> check-all. Currently there are 14 unit test failures. Please feel free
>> to fix these.
>>
>> Also currently, only I receive the nightly email notification for this
>> buildbot run. My plan is to enable sending the nightly notifications to
>> llvm-commits once all 14 failures have been resolved.
>> Please let me know if the community wants the nightly notifications even
>> with the failures.
>> As a potential next step, I was thinking about bootstrapping this
>> reverse iteration LLVM to compile itself. Not sure if it can uncover
>> more bugs but maybe worth a shot.
>
> To uncover bugs in this configuration, I believe you'd want/need a
> stage2/stage3 comparison, which might be a bit tricky/expensive*.
> Something like:
>
> build clang twice (reverse and forward enabled), then build (in one mode,
> doesn't matter which I think) clang or other release binaries (or even the
> whole release) from each of those and compare them bit-for-bit. They
> should be identical.
>
> * If you want other developers to act on bugs found, the buildbot needs to
> have a short blame list (this can be done on a slow buildbot by having
> multiple slaves/builders running in parallel) but preferably also a short
> cycle time (so failures are reported soon after they're created).
> Otherwise, expect to do a lot of triage yourself (and possibly leave the
> emails going only to you, because they'll have too large blame
> lists/revision ranges and people won't find them actionable), and then
> probably to follow up on the specific commit you believe introduced the
> problem, either fixing it yourself or replying on the commits list to
> report it to the original contributor.

I agree with what David said here, but I just wanted to say that you shouldn't feel too discouraged because of it. As someone who occasionally has to bisect 5h+ worth of revisions, I can tell you that in time you'll often be able to just look at the revisions and spot the culprit, or maybe 2-3 candidates that have likely caused the issue. Given that this bot does something very specific, you can then probably just inspect the code and see what caused the problem (if the revision doesn't touch any containers, then it probably didn't cause the issue, right?). It's a lot easier when you have a revision range, so it obviously won't take as long to identify and fix as the initial failures that you are seeing now.

Ultimately, it's up to you to decide how much effort you are willing / able to put into this. This kind of failure probably won't even occur that often in practice, but when it does I think it's important to find it and fix it. The best way to know for sure is to give it a try for a while and see how it goes. If you find that it's impractical, you can always revert to the current configuration.

>> All comments/suggestions welcome.
>>
>> Thanks,
>> Mandeep