This is a recent desktop running Xubuntu 19.10, compiling clang and llvm 10.0.0 (commands below). For this test I ran 14 processors in a GUI VM. The cores are hyperthreaded, so there are twice as many processors as cores, and all of them showed negligible activity before the run.

compile_commands.json has 3022 entries. The ninja compile run lasted 7 minutes and 43 seconds with 99% usage on all processors throughout. That gives 7*60+43 = 463 seconds, or 463/3022 = 0.1532 seconds of wall time per compile line in compile_commands.json. With 14 processors working in parallel, the average compile time per entry on a single processor would be about 14*0.1532 = 2.1448 seconds.

  cmake -G Ninja -DLLVM_ENABLE_PROJECTS="clang;llvm" -DLLVM_USE_LINKER=lld -DCMAKE_BUILD_TYPE="Release" -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_ENABLE_LIBPFM=OFF -DRUN_HAVE_GNU_POSIX_REGEX=0 -DRUN_HAVE_THREAD_SAFETY_ATTRIBUTES=0 -Wno-dev ../llvm &> /home/nnelson/Documents/cmake.log

  ninja &> /home/nnelson/Documents/ninja.log

Here are some useful pages from Threading Building Blocks.

Task-Based Programming
https://software.intel.com/en-us/node/506100

Appendix A Costs of Time Slicing
https://software.intel.com/en-us/node/506127

The point

When the number of compiles exceeds the number of cores such that all the cores are utilized, nothing is gained by trying to multi-thread the individual compiles. In fact, loading up the cores with more threads or tasks than there are cores will reduce compiling efficiency because of time slicing. And even when the cores are not overloaded, sequencing through more tasks rather than fewer will reduce compiling efficiency, because more tasks have to be loaded onto and unloaded from the cores.

Neil Nelson
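The timing arithmetic in the message above can be checked with a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the per-processor figure assumes, as the message does, that all 14 processors were busy for the whole run.

```python
# Figures taken from the message above; the names are illustrative only.
processors = 14               # processors given to the VM
entries = 3022                # entries in compile_commands.json
wall_seconds = 7 * 60 + 43    # ninja run: 7 minutes 43 seconds

# Wall-clock seconds per compile entry, averaged over the whole build.
wall_per_entry = wall_seconds / entries

# Assumption: every processor was fully busy, so each entry consumed
# about `processors` times its share of the wall-clock time.
cpu_per_entry = processors * wall_per_entry

print(f"total wall time: {wall_seconds} s")
print(f"wall time per entry: {wall_per_entry:.4f} s")
print(f"processor time per entry: {cpu_per_entry:.4f} s")
```

The unrounded product is about 2.145 seconds; the 2.1448 quoted in the thread comes from rounding the per-entry figure to 0.1532 before multiplying.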
On Mar 1, 2020, at 1:29 AM, Neil Nelson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> The point
>
> When the number of compiles exceeds the number of cores such that all the cores are utilized, nothing is gained by trying to multi-thread the individual compiles. In fact, loading up the cores with more threads or tasks than there are cores will reduce compiling efficiency because of time slicing. And sequencing through more tasks than less when the cores are not overloaded will reduce compiling efficiency because more tasks have to be loaded and unloaded to the cores.

That makes a lot of sense Neil. Two counterpoints though:

1) In development situations, it is common to rebuild one file (the one you changed) without rebuilding everything. Speeding that up is useful.

2) LLVM is a library that is used for a lot more than just C compilers. Many of its use cases are not modeled by bulk “compile all the code” workloads like you describe.

You’re right that multithreading isn’t a panacea, but in my opinion, it is important to be able to provide access to multicore compilation for use cases that do benefit from it.

-Chris
In addition to what Chris said, there’s also the case of large TUs / Unity files <https://en.wikipedia.org/wiki/Single_Compilation_Unit>. Given that the compilation of a single TU is currently not multi-threaded, you can get “spikes” like the one below, where only one core (out of many) is working. One minute wasted, when I have many cores available:

[image001.png]

For this specific case, not having Unity files makes the build uniform (no spikes like the one above), but the build then takes 10x longer (20k TUs and 25k .h files compiled, which reduce down to ~600 Unity files). To fix the spike, you then have to resort to an iterative optimization algorithm, run nightly, to find the best trade-off between Unity file size and build time. This complicates things further.

The point being, +1 for multi-threading the compiler :-)

Alex.

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Chris Lattner via llvm-dev
Sent: March 2, 2020 1:08 PM
To: Neil Nelson <nnelson at infowest.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Multi-Threading Compilers

On Mar 1, 2020, at 1:29 AM, Neil Nelson via llvm-dev <llvm-dev at lists.llvm.org> wrote:

The point

When the number of compiles exceeds the number of cores such that all the cores are utilized, nothing is gained by trying to multi-thread the individual compiles. In fact, loading up the cores with more threads or tasks than there are cores will reduce compiling efficiency because of time slicing. And sequencing through more tasks than less when the cores are not overloaded will reduce compiling efficiency because more tasks have to be loaded and unloaded to the cores.

That makes a lot of sense Neil. Two counterpoints though:

1) In development situations, it is common to rebuild one file (the one you changed) without rebuilding everything. Speeding that up is useful.

2) LLVM is a library that is used for a lot more than just C compilers.
Many of its use cases are not modeled by bulk “compile all the code” workloads like you describe.

You’re right that multithreading isn’t a panacea, but in my opinion, it is important to be able to provide access to multicore compilation for use cases that do benefit from it.

-Chris
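The "spike" Alex describes, one oversized Unity file keeping a single core busy while the rest sit idle, can be illustrated with a small scheduling sketch. Everything here is hypothetical: the core count and job durations are made-up numbers rather than measurements from his build, and the scheduler is a plain greedy model that is not meant to represent how any particular build tool orders its jobs.

```python
import heapq

def makespan(job_seconds, cores):
    """Greedy list scheduling: each job goes to the core that frees up first.

    Returns the time at which the last core finishes.
    """
    finish_times = [0.0] * cores       # when each core next becomes idle
    heapq.heapify(finish_times)
    for seconds in job_seconds:
        start = heapq.heappop(finish_times)          # earliest idle core
        heapq.heappush(finish_times, start + seconds)
    return max(finish_times)

# Hypothetical workload, not measured data: 600 single-threaded compiles.
cores = 16
uniform = [5.0] * 600                  # every Unity file takes 5 s
with_spike = [5.0] * 599 + [60.0]      # one 60 s Unity file, scheduled last

print("uniform build:   ", makespan(uniform, cores), "s")
print("build with spike:", makespan(with_spike, cores), "s")
```

In this made-up case the uniform build finishes at 190 s and the build with the oversized file at 245 s, and for most of that difference a single core is working alone. A compiler that could spread one TU over several threads would shorten exactly that tail.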
Very good. I just saw that Alex is making some good points.

The TBB pages earlier are an excellent reference for multi-threading or for TBB multi-tasking. They illustrate issues that a good threading design should consider and that are not commonly recognized when embarking on such a design. I would say the task scheduler is their most comprehensive approach.

Clearly, I am not familiar enough with LLVM to make any definite prescription, and I defer to your more knowledgeable judgment. But in the spirit of attempting a more complete perspective, we may consider that a threading design for LLVM is not simple. I suspect it is not nearly as simple as has been explored to this point, though I am not saying that the current ideas are simple, since I do not understand them.

For example, you remarked that there are use cases where a multi-threaded compile would be useful, and having a realistic appreciation of what those use cases may be is important. But on that point, if we were to have threading in the compile to an object file, would we then overload the cores and make them slower when using ninja to compile LLVM? How would the compile threading take this other use case into account, if that was something to consider? Do we encumber the LLVM design with threading such that the trade-off for that design against its use-case benefits is not well justified?

My sense is that if threading were judged a reasonable goal, a preliminary design would be presented after some research, one that would initially complement non-threading goals: a general improvement that would yield a better track for threading later. This reflects the sense that a good threading design is not simple and will impact a significant portion of LLVM. And it could well be that this is what is happening now.

How would threading impact future LLVM development? My sense is that some reasonable portion of future development would be made more difficult.

You mentioned that single object compiling would have a benefit.
Here are some factors that could mitigate that benefit.

Although core speeds are expected to remain in the 4-5 gigahertz range rather than increase, IPC (instructions per cycle) is expected to increase slowly. Over time there will therefore be a natural reduction in compile time for any given benchmark compile.

Small source files also compile faster than larger ones, and a design argument could be made for some maximum source file size to reduce complexity, which would also help keep compile time down. Is there a reason to expect an increase in source file size that would increase compile time?

Noting that the LLVM compile test from my prior post gave an average compile time of 14*0.1532 = 2.1448 seconds, is compile time a significant or a marginal issue? What would be the target for an average compile time?

Neil Nelson

On 3/2/20 11:07 AM, Chris Lattner wrote:
> On Mar 1, 2020, at 1:29 AM, Neil Nelson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> The point
>>
>> When the number of compiles exceeds the number of cores such that all the cores are utilized, nothing is gained by trying to multi-thread the individual compiles. In fact, loading up the cores with more threads or tasks than there are cores will reduce compiling efficiency because of time slicing. And sequencing through more tasks than less when the cores are not overloaded will reduce compiling efficiency because more tasks have to be loaded and unloaded to the cores.
>
> That makes a lot of sense Neil. Two counterpoints though:
>
> 1) In development situations, it is common to rebuild one file (the one you changed) without rebuilding everything. Speeding that up is useful.
>
> 2) LLVM is a library that is used for a lot more than just C compilers. Many of its use cases are not modeled by bulk “compile all the code” workloads like you describe.
>
> You’re right that multithreading isn’t a panacea, but in my opinion, it is important to be able to provide access to multicore compilation for use cases that do benefit from it.
>
> -Chris
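The overload question raised in the message above (ninja already running one compile per processor, with each compile then starting threads of its own) can be put in rough numbers. The sketch below is hypothetical: `threads_per_compile` is an imagined setting, not an existing clang or ninja option, and the function names are invented for illustration. Only ninja's `-j` job count is a real knob.

```python
def runnable_threads(ninja_jobs, threads_per_compile):
    """Threads competing for the processors if every compile is multi-threaded."""
    return ninja_jobs * threads_per_compile

def jobs_within_budget(processors, threads_per_compile):
    """Largest ninja job count that keeps runnable threads <= processors."""
    return max(1, processors // threads_per_compile)

processors = 14   # as in the test described earlier in the thread

# threads_per_compile is a hypothetical per-compile thread count.
for threads_per_compile in (1, 2, 4):
    total = runnable_threads(processors, threads_per_compile)
    jobs = jobs_within_budget(processors, threads_per_compile)
    print(f"{threads_per_compile} thread(s) per compile: "
          f"{total} runnable threads with {processors} jobs, "
          f"or {jobs} jobs to stay within {processors} processors")
```

With one thread per compile the two figures coincide; with more, either the processors are oversubscribed or the job count has to drop, which is the trade-off the questions above are pointing at.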