Duncan P. N. Exon Smith
2015-Apr-01 02:10 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
A while back I finished up some work [1] that Chad started to preserve use-list-order in bitcode [2], hidden behind an "experimental" option called `-preserve-bc-use-list-order`. I then added a similar `-preserve-ll-use-list-order` option for LLVM assembly [3]. [1]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074604.html [2]: https://llvm.org/bugs/show_bug.cgi?id=5680 [3]: https://llvm.org/bugs/show_bug.cgi?id=20515 I'd like to move both of these options out of "experimental" mode, and turn `-preserve-bc-use-list-order` on by default. I've attached a patch that does this. Why? === Use-list order affects the output of some LLVM passes. A typical example is a pass that walks through basic block predecessors. The use-list order is deterministic, but not preserved when serializing the LLVM IR to bitcode or assembly. In rare (but frustrating) cases, this makes it difficult to reproduce a crash or miscompile from an intermediate result. For example, consider running an LTO build and serializing between the LTO optimization pipeline and the LTO codegen pipeline. On SPEC, serializing to/from bitcode will change the output executable in 33 benchmarks. If you use `-preserve-bc-use-list-order`, all executables match. How? === Have a look at `BitcodeReader::ParseUseLists()` and `predictUseListOrder()` in `lib/Bitcode/Writer/ValueEnumerator.cpp`. Does it degrade smoothly if IR changes? ====================================== Yes. If a `Value`'s use-list has changed in size (e.g., because of bitcode upgrades), its use-list order record is ignored. What does it cost? ================= Manman collected a bunch of numbers, with `-preserve-bc-use-list-order` and `-preserve-ll-use-list-order` both turned on: - Time increase on LTO build time: negligible. - Filesize increase in bitcode files output by `clang -flto`: - clang: 6.8556% (sum when from 310412640 to 331693296 bytes). - Filesize increase in merged bitcode files (output of `ld -save-temps` when using `clang -flto` on Darwin). — SPEC: 5.24% to 23.4% increase in file size. — clang and related tools: 0.83% to 9.19% (details in filesize_clang.txt, attached). - Time increase of running `llvm-dis` on merged bitcode files: — 6.4% to 15% on SPEC benchmarks with running time >= 1 second. — 7.9% to 14.5% on clang with running time >= 1 second (details in dis_clang.txt, attached). - Time increase of running `llvm-as` on merged bitcode files: — 3.5% to 39% on SPEC benchmarks with running time >= 1 second. — 14.7% to 24.5% with running time >= 1 second (details in as_clang.txt, attached). Does it work? ============ Manman also ran `verify-uselistorder` (a use-list fuzzer that shuffles use-lists, serializes IR, compares use-lists, rinses, repeats), on the merged bitcode files above. After I fixed a bug in assembly serialization in r232200, they all come up clean. Why leave -preserve-ll-use-list-order=false by default? ====================================================== Firstly, our practise is to error on mismatches in LLVM assembly (unlike bitcode where we upgrade or drop old formats). Printing use-list directives by default would make it almost impossible to update an assembly file by hand without stripping out the directives. Secondly, use-list order is encoded as a "shuffle", which isn't human-friendly. I'm not convinced this adds value to textual IR except in rare cases where a developer actually wants to see the order, or wants to try to make a change without perturbing the use-lists. Nevertheless, the option works. An LLVM developer can turn it on and hack at the IR, and `llvm-as` will complain if the developer makes the `uselistorder` directives inconsistent. The developer can choose whether to delete the mismatched directives or to update them. -------------- next part -------------- A non-text attachment was scrubbed... Name: use-list-order.patch Type: application/octet-stream Size: 818 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150331/a84886c5/attachment.obj> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: filesize_clang.txt URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150331/a84886c5/attachment.txt> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: dis_clang.txt URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150331/a84886c5/attachment-0001.txt> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: as_clang.txt URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150331/a84886c5/attachment-0002.txt>
Rafael Espíndola
2015-Apr-01 14:55 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
This is fine by me. It would be nice to have an easier way to edit a .ll with order information, but that should not be a blocker. On 31 March 2015 at 22:10, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:> A while back I finished up some work [1] that Chad started to preserve > use-list-order in bitcode [2], hidden behind an "experimental" option > called `-preserve-bc-use-list-order`. I then added a similar > `-preserve-ll-use-list-order` option for LLVM assembly [3]. > > [1]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074604.html > [2]: https://llvm.org/bugs/show_bug.cgi?id=5680 > [3]: https://llvm.org/bugs/show_bug.cgi?id=20515 > > I'd like to move both of these options out of "experimental" mode, and > turn `-preserve-bc-use-list-order` on by default. I've attached a patch > that does this. > > Why? > ===> > Use-list order affects the output of some LLVM passes. A typical > example is a pass that walks through basic block predecessors. The > use-list order is deterministic, but not preserved when serializing the > LLVM IR to bitcode or assembly. > > In rare (but frustrating) cases, this makes it difficult to reproduce a > crash or miscompile from an intermediate result. > > For example, consider running an LTO build and serializing between the > LTO optimization pipeline and the LTO codegen pipeline. On SPEC, > serializing to/from bitcode will change the output executable in 33 > benchmarks. If you use `-preserve-bc-use-list-order`, all executables > match. > > How? > ===> > Have a look at `BitcodeReader::ParseUseLists()` and > `predictUseListOrder()` in `lib/Bitcode/Writer/ValueEnumerator.cpp`. > > Does it degrade smoothly if IR changes? > ======================================> > Yes. If a `Value`'s use-list has changed in size (e.g., because of > bitcode upgrades), its use-list order record is ignored. > > What does it cost? > =================> > Manman collected a bunch of numbers, with `-preserve-bc-use-list-order` > and `-preserve-ll-use-list-order` both turned on: > > - Time increase on LTO build time: negligible. > - Filesize increase in bitcode files output by `clang -flto`: > - clang: 6.8556% (sum when from 310412640 to 331693296 bytes). > - Filesize increase in merged bitcode files (output of > `ld -save-temps` when using `clang -flto` on Darwin). > — SPEC: 5.24% to 23.4% increase in file size. > — clang and related tools: 0.83% to 9.19% (details in > filesize_clang.txt, attached). > - Time increase of running `llvm-dis` on merged bitcode files: > — 6.4% to 15% on SPEC benchmarks with running time >= 1 second. > — 7.9% to 14.5% on clang with running time >= 1 second (details in > dis_clang.txt, attached). > - Time increase of running `llvm-as` on merged bitcode files: > — 3.5% to 39% on SPEC benchmarks with running time >= 1 second. > — 14.7% to 24.5% with running time >= 1 second (details in > as_clang.txt, attached). > > Does it work? > ============> > Manman also ran `verify-uselistorder` (a use-list fuzzer that shuffles > use-lists, serializes IR, compares use-lists, rinses, repeats), on the > merged bitcode files above. After I fixed a bug in assembly > serialization in r232200, they all come up clean. > > Why leave -preserve-ll-use-list-order=false by default? > ======================================================> > Firstly, our practise is to error on mismatches in LLVM assembly (unlike > bitcode where we upgrade or drop old formats). Printing use-list > directives by default would make it almost impossible to update an > assembly file by hand without stripping out the directives. > > Secondly, use-list order is encoded as a "shuffle", which isn't > human-friendly. I'm not convinced this adds value to textual IR except > in rare cases where a developer actually wants to see the order, or > wants to try to make a change without perturbing the use-lists. > > Nevertheless, the option works. An LLVM developer can turn it on and > hack at the IR, and `llvm-as` will complain if the developer makes the > `uselistorder` directives inconsistent. The developer can choose > whether to delete the mismatched directives or to update them. > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
Duncan P. N. Exon Smith
2015-Apr-09 17:45 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
Committed in r234510.> On 2015-Apr-01, at 07:55, Rafael Espíndola <rafael.espindola at gmail.com> wrote: > > This is fine by me. > > It would be nice to have an easier way to edit a .ll with order > information, but that should not be a blocker. > > On 31 March 2015 at 22:10, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: >> A while back I finished up some work [1] that Chad started to preserve >> use-list-order in bitcode [2], hidden behind an "experimental" option >> called `-preserve-bc-use-list-order`. I then added a similar >> `-preserve-ll-use-list-order` option for LLVM assembly [3]. >> >> [1]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074604.html >> [2]: https://llvm.org/bugs/show_bug.cgi?id=5680 >> [3]: https://llvm.org/bugs/show_bug.cgi?id=20515 >> >> I'd like to move both of these options out of "experimental" mode, and >> turn `-preserve-bc-use-list-order` on by default. I've attached a patch >> that does this. >> >> Why? >> ===>> >> Use-list order affects the output of some LLVM passes. A typical >> example is a pass that walks through basic block predecessors. The >> use-list order is deterministic, but not preserved when serializing the >> LLVM IR to bitcode or assembly. >> >> In rare (but frustrating) cases, this makes it difficult to reproduce a >> crash or miscompile from an intermediate result. >> >> For example, consider running an LTO build and serializing between the >> LTO optimization pipeline and the LTO codegen pipeline. On SPEC, >> serializing to/from bitcode will change the output executable in 33 >> benchmarks. If you use `-preserve-bc-use-list-order`, all executables >> match. >> >> How? >> ===>> >> Have a look at `BitcodeReader::ParseUseLists()` and >> `predictUseListOrder()` in `lib/Bitcode/Writer/ValueEnumerator.cpp`. >> >> Does it degrade smoothly if IR changes? >> ======================================>> >> Yes. If a `Value`'s use-list has changed in size (e.g., because of >> bitcode upgrades), its use-list order record is ignored. >> >> What does it cost? >> =================>> >> Manman collected a bunch of numbers, with `-preserve-bc-use-list-order` >> and `-preserve-ll-use-list-order` both turned on: >> >> - Time increase on LTO build time: negligible. >> - Filesize increase in bitcode files output by `clang -flto`: >> - clang: 6.8556% (sum when from 310412640 to 331693296 bytes). >> - Filesize increase in merged bitcode files (output of >> `ld -save-temps` when using `clang -flto` on Darwin). >> — SPEC: 5.24% to 23.4% increase in file size. >> — clang and related tools: 0.83% to 9.19% (details in >> filesize_clang.txt, attached). >> - Time increase of running `llvm-dis` on merged bitcode files: >> — 6.4% to 15% on SPEC benchmarks with running time >= 1 second. >> — 7.9% to 14.5% on clang with running time >= 1 second (details in >> dis_clang.txt, attached). >> - Time increase of running `llvm-as` on merged bitcode files: >> — 3.5% to 39% on SPEC benchmarks with running time >= 1 second. >> — 14.7% to 24.5% with running time >= 1 second (details in >> as_clang.txt, attached). >> >> Does it work? >> ============>> >> Manman also ran `verify-uselistorder` (a use-list fuzzer that shuffles >> use-lists, serializes IR, compares use-lists, rinses, repeats), on the >> merged bitcode files above. After I fixed a bug in assembly >> serialization in r232200, they all come up clean. >> >> Why leave -preserve-ll-use-list-order=false by default? >> ======================================================>> >> Firstly, our practise is to error on mismatches in LLVM assembly (unlike >> bitcode where we upgrade or drop old formats). Printing use-list >> directives by default would make it almost impossible to update an >> assembly file by hand without stripping out the directives. >> >> Secondly, use-list order is encoded as a "shuffle", which isn't >> human-friendly. I'm not convinced this adds value to textual IR except >> in rare cases where a developer actually wants to see the order, or >> wants to try to make a change without perturbing the use-lists. >> >> Nevertheless, the option works. An LLVM developer can turn it on and >> hack at the IR, and `llvm-as` will complain if the developer makes the >> `uselistorder` directives inconsistent. The developer can choose >> whether to delete the mismatched directives or to update them. >> >> >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
David Blaikie
2015-Apr-09 18:06 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
Late to the party because I figured other people would chime in, but I'll have a go... On Tue, Mar 31, 2015 at 7:10 PM, Duncan P. N. Exon Smith < dexonsmith at apple.com> wrote:> A while back I finished up some work [1] that Chad started to preserve > use-list-order in bitcode [2], hidden behind an "experimental" option > called `-preserve-bc-use-list-order`. I then added a similar > `-preserve-ll-use-list-order` option for LLVM assembly [3]. > > [1]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074604.html > [2]: https://llvm.org/bugs/show_bug.cgi?id=5680 > [3]: https://llvm.org/bugs/show_bug.cgi?id=20515 > > I'd like to move both of these options out of "experimental" mode, and > turn `-preserve-bc-use-list-order` on by default. I've attached a patch > that does this. > > Why? > ===> > Use-list order affects the output of some LLVM passes. A typical > example is a pass that walks through basic block predecessors. The > use-list order is deterministic, but not preserved when serializing the > LLVM IR to bitcode or assembly. > > In rare (but frustrating) cases, this makes it difficult to reproduce a > crash or miscompile from an intermediate result. > > For example, consider running an LTO build and serializing between the > LTO optimization pipeline and the LTO codegen pipeline. On SPEC, > serializing to/from bitcode will change the output executable in 33 > benchmarks. If you use `-preserve-bc-use-list-order`, all executables > match. >Why do you need those to match? What's the penalty/problem when these don't match? Reproducibility for tests/bugs/etc seems important, but if it's any worse/better with a particular use list, that's a bug we should fix, right? Forcing a specific use list isn't going to fix those bugs, just make sure we make the same decision (good or bad) every time.> How? > ===> > Have a look at `BitcodeReader::ParseUseLists()` and > `predictUseListOrder()` in `lib/Bitcode/Writer/ValueEnumerator.cpp`. > > Does it degrade smoothly if IR changes? > ======================================> > Yes. If a `Value`'s use-list has changed in size (e.g., because of > bitcode upgrades), its use-list order record is ignored. > > What does it cost? > =================> > Manman collected a bunch of numbers, with `-preserve-bc-use-list-order` > and `-preserve-ll-use-list-order` both turned on: > > - Time increase on LTO build time: negligible. > - Filesize increase in bitcode files output by `clang -flto`: > - clang: 6.8556% (sum when from 310412640 to 331693296 bytes). > - Filesize increase in merged bitcode files (output of > `ld -save-temps` when using `clang -flto` on Darwin). > — SPEC: 5.24% to 23.4% increase in file size. > — clang and related tools: 0.83% to 9.19% (details in > filesize_clang.txt, attached). > - Time increase of running `llvm-dis` on merged bitcode files: > — 6.4% to 15% on SPEC benchmarks with running time >= 1 second. > — 7.9% to 14.5% on clang with running time >= 1 second (details in > dis_clang.txt, attached). > - Time increase of running `llvm-as` on merged bitcode files: > — 3.5% to 39% on SPEC benchmarks with running time >= 1 second. > — 14.7% to 24.5% with running time >= 1 second (details in > as_clang.txt, attached). >These seem like pretty big costs to pay (bitcode size is going to be particularly important to Google - big projects under LTO, limits on the total size of the inputs to the link step, etc). To the point above, it's not clear why we'd want to pay that cost. (I'm partly playing devil's advocate here - I realize reproducibility is really important for a bunch of reasons, though this particular reproducibility is a marginal one (compared to "run the compiler twice on the same input and get two different answers") but it seems like we've generally treated these issues as bugs and fixed the optimizations to be use-list-order independent in the past, no?)> > Does it work? > ============> > Manman also ran `verify-uselistorder` (a use-list fuzzer that shuffles > use-lists, serializes IR, compares use-lists, rinses, repeats), on the > merged bitcode files above. After I fixed a bug in assembly > serialization in r232200, they all come up clean. > > Why leave -preserve-ll-use-list-order=false by default? > ======================================================> > Firstly, our practise is to error on mismatches in LLVM assembly (unlike > bitcode where we upgrade or drop old formats). Printing use-list > directives by default would make it almost impossible to update an > assembly file by hand without stripping out the directives. > > Secondly, use-list order is encoded as a "shuffle", which isn't > human-friendly. I'm not convinced this adds value to textual IR except > in rare cases where a developer actually wants to see the order, or > wants to try to make a change without perturbing the use-lists. > > Nevertheless, the option works. An LLVM developer can turn it on and > hack at the IR, and `llvm-as` will complain if the developer makes the > `uselistorder` directives inconsistent. The developer can choose > whether to delete the mismatched directives or to update them. > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150409/e03ee67e/attachment.html>
Duncan P. N. Exon Smith
2015-Apr-09 19:37 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
> On 2015-Apr-09, at 11:06, David Blaikie <dblaikie at gmail.com> wrote: > > Late to the party because I figured other people would chime in, but I'll have a go... > > On Tue, Mar 31, 2015 at 7:10 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > A while back I finished up some work [1] that Chad started to preserve > use-list-order in bitcode [2], hidden behind an "experimental" option > called `-preserve-bc-use-list-order`. I then added a similar > `-preserve-ll-use-list-order` option for LLVM assembly [3]. > > [1]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074604.html > [2]: https://llvm.org/bugs/show_bug.cgi?id=5680 > [3]: https://llvm.org/bugs/show_bug.cgi?id=20515 > > I'd like to move both of these options out of "experimental" mode, and > turn `-preserve-bc-use-list-order` on by default. I've attached a patch > that does this. > > Why? > ===> > Use-list order affects the output of some LLVM passes. A typical > example is a pass that walks through basic block predecessors. The > use-list order is deterministic, but not preserved when serializing the > LLVM IR to bitcode or assembly. > > In rare (but frustrating) cases, this makes it difficult to reproduce a > crash or miscompile from an intermediate result. > > For example, consider running an LTO build and serializing between the > LTO optimization pipeline and the LTO codegen pipeline. On SPEC, > serializing to/from bitcode will change the output executable in 33 > benchmarks. If you use `-preserve-bc-use-list-order`, all executables > match. > > Why do you need those to match? What's the penalty/problem when these don't match?You need these to match so that running a pass using `opt` or `llc` on a dumped bitcode file gets you the same result as running the pass directly. This sort of thing (dumping out temporary results and continuing from there) is useful when triaging bugs. The problem is that the bug may not reproduce at all when starting from the dumped bitcode.> Reproducibility for tests/bugs/etc seems important, but if it's any worse/better with a particular use list, that's a bug we should fix, right? Forcing a specific use list isn't going to fix those bugs, just make sure we make the same decision (good or bad) every time.I'm not sure there's consensus that this is a bug. It's not optimal, but it's not clear to me that it's invalid to depend on use-list order. In some cases, it may be a reasonable heuristic that improves compile time (I'm not arguing for that, I'm just not convinced otherwise).> What does it cost? > =================> > Manman collected a bunch of numbers, with `-preserve-bc-use-list-order` > and `-preserve-ll-use-list-order` both turned on: > > - Time increase on LTO build time: negligible. > - Filesize increase in bitcode files output by `clang -flto`: > - clang: 6.8556% (sum when from 310412640 to 331693296 bytes). > - Filesize increase in merged bitcode files (output of > `ld -save-temps` when using `clang -flto` on Darwin). > — SPEC: 5.24% to 23.4% increase in file size. > — clang and related tools: 0.83% to 9.19% (details in > filesize_clang.txt, attached). > - Time increase of running `llvm-dis` on merged bitcode files: > — 6.4% to 15% on SPEC benchmarks with running time >= 1 second. > — 7.9% to 14.5% on clang with running time >= 1 second (details in > dis_clang.txt, attached). > - Time increase of running `llvm-as` on merged bitcode files: > — 3.5% to 39% on SPEC benchmarks with running time >= 1 second. > — 14.7% to 24.5% with running time >= 1 second (details in > as_clang.txt, attached). > > These seem like pretty big costs to pay (bitcode size is going to be particularly important to Google - big projects under LTO, limits on the total size of the inputs to the link step, etc). To the point above, it's not clear why we'd want to pay that cost. (I'm partly playing devil's advocate here - I realize reproducibility is really important for a bunch of reasons, though this particular reproducibility is a marginal one (compared to "run the compiler twice on the same input and get two different answers") but it seems like we've generally treated these issues as bugs and fixed the optimizations to be use-list-order independent in the past, no?) >(FWIW, there's some ancient discussion in PR5680 about this.) I don't have a strong opinion on whether depending on use-list order should be considered a bug. However, it *is* a bug not to be able to roundtrip to bitcode and get the same end result. While it may be possible to remove the compiler's dependency on use-list order, no one has signed up to do the work, it's not clear what the compile time cost would be, and there isn't consensus that it's the right end goal. In the meantime, this fixes the bug. If/when that hypothetical work is done and validated we can turn this off by default if it makes sense to. In terms of LTO bitcode size: serializing use-list order isn't actually necessary/useful for the "normal" outputs of `clang -flto`. It's important for `clang -emit-llvm`, `clang [-flto] -save-temps`, and `<gold/libLTO> -save-temps`, but serialization between "compile" and "link" is a deterministic and reproducible step. A possible optimization would be to disable the option when writing `clang`'s (final) output file in `clang -flto`. Thoughts? Alternatively, some portions of the aforementioned hypothetical work might be low-hanging fruit. E.g., it might not be hard to validate that the use-list order of `ConstantInt`s doesn't affect output (and if it does, it would probably be easy to fix); at the same time, I imagine they account for a disproportionate percentage of the bitcode bloat (they have large use-lists that change frequently).