Gerolf Hoflehner via llvm-dev
2019-Jun-26 22:03 UTC
[llvm-dev] Representations of IR in the output of opt
I finally got back to this. It is a known and endemic issue that pops up from time to time. The issues I’m aware of so far are related to random sets (containers with no guaranteed iteration order) being used where strict order is required. This may result in non-deterministic uselists issued by the bitcode/assembly writers.

There is no great way to go about proactive testing for this. Collecting the tests so far and running them as regression tests occasionally might serve as a feel-better bandage. Neither can I think of good checks in a verifier. These bugs show up from time to time, disappear, show up again, and essentially any commit could expose them or make them disappear. The medicine to take may well be supplying deterministic implementations for DenseMap, SmallPtrSet, and probably DenseSet, and limiting the current implementations to cases where order is irrelevant, like in data flow analysis etc.

I pushed back one fix for sccp, and will post one for adce later. Hopefully they will help in your case, but I doubt they are exhaustive.

FWIW, there is one bright spot here: I have no (not yet…) example where incorrect code is generated.

-Gerolf

> On May 31, 2019, at 2:08 AM, Gerolf Hoflehner via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> There is a non-deterministic problem with the uselists. The code causing this is almost identical in the IR and the bc writer. In some invocations of opt the uselists are shuffled; in others (same input) they are not. I haven’t nailed the root cause yet. It has the flavor of a stack memory corruption.
> For a quick check that you see the same issue you could disable the shuffle code in the writers.
>
> Gerolf
>
>> On May 30, 2019, at 1:41 PM, Sébastien Michelland via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hello again,
>>
>>> It may be desirable to sort the table before writing the bitcode out, adding Peter to the thread for his opinion.
>>
>> Thanks for this!
>>
>> Now it seems I've been optimistic about this result. I have instrumented the test suite to check it on a wider set of files and quickly discovered that it fails for larger optimization sequences.
>>
>> In particular, the default -O3 set in which I'm interested is not reproduced easily. I'm attaching a script that demonstrates this.
>>
>> It contains the extracted -O3 set in two groups, and checks that [opt -debug-pass=Arguments] reports the same sequences when called with -O3 and the individual arguments. If a file name is provided, it also checks that the outputs are the same (or in our case, different).
>>
>> Many real files fail to pass this test, for instance bilateral_grid.bc:
>>
>> <https://github.com/llvm/llvm-test-suite/blob/master/Bitcode/Benchmarks/Halide/bilateral_grid/bilateral_grid.bc>
>>
>> The diffs are very large even in text mode, and include lots of code.
>>
>> I'm puzzled again. Any clue on the behavior of opt is very welcome. :)
>>
>> Cheers,
>> Sébastien Michelland
>> <not-associative.sh>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
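A minimal illustration of the failure mode described in this message, written in Python rather than taken from LLVM. It assumes a hypothetical `Value` class whose hash stands in for a pointer address that changes from run to run; iterating a hash set of such values gives an order that follows the "addresses", while an insertion-ordered container gives the same order every time. In LLVM itself, the insertion-ordered counterparts in the ADT library are SetVector and MapVector; whether they fit a given pass is a separate question that this sketch does not answer.

```python
import random


class Value:
    """Hypothetical stand-in for an IR value; its hash mimics a memory address."""

    def __init__(self, name, address):
        self.name = name
        self.address = address  # assumption: differs between runs, like a pointer

    def __hash__(self):
        return self.address

    def __eq__(self, other):
        return self is other


def make_values(names, seed):
    """Create values with pseudo-random 'addresses'; the seed plays the role of one run."""
    rng = random.Random(seed)
    return [Value(name, rng.getrandbits(32)) for name in names]


def visit_unordered(values):
    """Iterate a hash set: the order follows the hashes, not the insertion order."""
    return [v.name for v in set(values)]


def visit_ordered(values):
    """Iterate an insertion-ordered container (dict keys keep insertion order)."""
    return [v.name for v in dict.fromkeys(values)]
```

Calling `visit_unordered(make_values(names, seed))` with different seeds typically yields different permutations of the same names, whereas `visit_ordered` always returns them as inserted. Anything written out in the first order (a uselist, for example) inherits that variability.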
Mehdi AMINI via llvm-dev
2019-Jun-28 05:31 UTC
[llvm-dev] Representations of IR in the output of opt
On Wed, Jun 26, 2019 at 3:04 PM Gerolf Hoflehner via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I finally got back to this. It is a known and endemic issue that pops up
> from time to time. The issues I’m aware of so far are related to random
> sets being used where strict order is required. This may result in
> non-deterministic uselists issued by the bitcode/assembly writers.
>
> There is no great way to go about pro-active testing for this. Collecting
> the tests so far and running them as regression tests occasionally might
> serve as a feel better bandage. Neither can I think of good checks in a
> verifier. These bugs show up from time to time, disappear, show up again,
> and essentially any commit could expose them or make them disappear. The
> medicine to take may well be supplying deterministic implementations for
> DenseMap, SmallPtrSet, and probably DenseSet - and have current usage
> limited to cases where order is irrelevant, like in data flow analysis etc.

Isn't LLVM_REVERSE_ITERATION intended to catch issues where we depend on the iteration order? I don't know if we have bots using this flag though.
Would building something like LLVM itself with a host clang built with and without this flag and comparing the output be enough?

--
Mehdi
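The idea behind a reverse-iteration build can be sketched in a few lines of Python. This is only an analogy under stated assumptions: `iterate`, `number_values`, `number_values_sorted` and `depends_on_order` are hypothetical names, and the `reverse` parameter stands in for the build-time switch. Code whose result changes when the container is walked backwards has a hidden dependency on iteration order; code that sorts by a stable key first does not.

```python
def iterate(items, reverse=False):
    """Walk items forward or backward, mimicking a reverse-iteration switch."""
    return reversed(items) if reverse else iter(items)


def number_values(bucket_order, reverse=False):
    """Order-dependent: numbers values in whatever order the container yields them."""
    return {v: i for i, v in enumerate(iterate(bucket_order, reverse))}


def number_values_sorted(bucket_order, reverse=False):
    """Order-independent: sorts by a stable key (the name) before numbering."""
    return {v: i for i, v in enumerate(sorted(iterate(bucket_order, reverse)))}


def depends_on_order(fn, bucket_order):
    """Report whether fn's result changes when iteration is reversed."""
    return fn(bucket_order, reverse=False) != fn(bucket_order, reverse=True)
```

With `buckets = ["%c", "%a", "%b"]`, `depends_on_order(number_values, buckets)` is true and `depends_on_order(number_values_sorted, buckets)` is false. A forward/reverse comparison only flags a dependency when the two walks actually produce different results for the inputs tried, so an unchanged output is evidence rather than proof.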
Gerolf Hoflehner via llvm-dev
2019-Jun-28 06:03 UTC
[llvm-dev] Representations of IR in the output of opt
> On Jun 27, 2019, at 10:31 PM, Mehdi AMINI <joker.eph at gmail.com> wrote:
>
> Isn't LLVM_REVERSE_ITERATION intended to catch issues where we depend on the iteration order? I don't know if we have bots using this flag though.
> Would building something like LLVM itself with a host clang built with and without this flag and comparing the output be enough?

This is a good idea. You would want to check the usual test suites first, both for debugging simplicity and to increase the probability of finding issues. The crux is that you usually need an unpredictable number of runs before you can spot the issue. I would not consider a simple a/b comparison proof of the absence of the issue, even with the forward/reverse perturbation. But I could see a bot running many a/b tests to flush these bugs out over time.
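A repeated-run check along these lines could look like the following Python sketch. It is an assumption about how such a bot might be structured, not something posted in the thread: `output_digests` and `is_stable` are hypothetical helpers that run the same command several times and compare hashes of its standard output.

```python
import hashlib
import subprocess
from collections import Counter


def output_digests(cmd, runs):
    """Run cmd `runs` times and count how often each SHA-256 digest of stdout occurs."""
    digests = Counter()
    for _ in range(runs):
        result = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
        digests[hashlib.sha256(result.stdout).hexdigest()] += 1
    return digests


def is_stable(cmd, runs=20):
    """True if every run produced byte-identical output (evidence, not proof)."""
    return len(output_digests(cmd, runs)) == 1
```

For the case discussed in this thread one might call `is_stable(["opt", "-O3", "-S", "bilateral_grid.bc", "-o", "-"], runs=50)`, assuming an `opt` on the PATH that accepts this 2019-era command line. More than one distinct digest shows non-determinism; a single digest only says it did not show up in those runs.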