greened at obbligato.org (David A. Greene) writes:

> Joachim Durchholz <jo at durchholz.org> writes:
>
>> On the reasons why make-based builds are slow, Peter Miller has some
>> insight to offer: http://miller.emu.id.au/pmiller/books/rmch/ .
>> I'm not sure how widely recognized that paper is. Maybe it's widely
>> known and today's build times stem from other things than recursive make.
>
> The paper is widely recognized. Its lessons, unfortunately, are not.
>
> Chris is absolutely on-target as to why the current build is slow. It's
> slow because recursive make hides the parallelism. It hides the
> parallelism because it hides the dependencies. There is no way to get
> around that problem with a recursive make build system.

You keep repeating that and I say that it is wrong. Can you mention a
serialization point in the LLVM build caused by recursive make?
(GenLibDeps is not such an example, as described in a previous message.)

[snip]
On Nov 1, 2011, at 4:33 PM, Óscar Fuentes wrote:

> greened at obbligato.org (David A. Greene) writes:
>
>> Joachim Durchholz <jo at durchholz.org> writes:
>>
>>> On the reasons why make-based builds are slow, Peter Miller has some
>>> insight to offer: http://miller.emu.id.au/pmiller/books/rmch/ .
>>> I'm not sure how widely recognized that paper is. Maybe it's widely
>>> known and today's build times stem from other things than recursive make.
>>
>> The paper is widely recognized. Its lessons, unfortunately, are not.
>>
>> Chris is absolutely on-target as to why the current build is slow. It's
>> slow because recursive make hides the parallelism. It hides the
>> parallelism because it hides the dependencies. There is no way to get
>> around that problem with a recursive make build system.
>
> You keep repeating that and I say that it is wrong. Can you mention a
> serialization point in the LLVM build caused by recursive make?
> (GenLibDeps is not such an example, as described in a previous message.)

Any use of DIRS is a serialization point. For example, lib/Support ->
lib/TableGen -> utils are all built in serial before anything else is.

-Chris
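For illustration, here is a minimal sketch of why a DIRS-style recursive rule
serializes whole directories (the rule shape and directory names are
illustrative, not a quote of LLVM's actual Makefile.rules):

    # The sub-makes are launched one after another from a single shell loop,
    # so even under `make -j` nothing in lib/TableGen or utils can start
    # until the lib/Support sub-make has finished.
    DIRS := lib/Support lib/TableGen utils

    all::
            for dir in $(DIRS); do \
              $(MAKE) -C $$dir all || exit 1; \
            done

Parallelism inside each sub-make still works, but the directories themselves
form a strict pipeline.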
Óscar Fuentes <ofv at wanadoo.es> writes:

>> Chris is absolutely on-target as to why the current build is slow. It's
>> slow because recursive make hides the parallelism. It hides the
>> parallelism because it hides the dependencies. There is no way to get
>> around that problem with a recursive make build system.
>
> You keep repeating that and I say that it is wrong. Can you mention a
> serialization point in the LLVM build caused by recursive make?
> (GenLibDeps is not such an example, as described in a previous message.)

The fact that the LLVM build has to run through all of the directories,
read Makefiles, check dependencies in each Makefile, etc. In essence, a
recursive make adds implicit dependencies on all of the sub-Makefiles.
Those Makefiles have to be processed before any real work can begin.
That includes shell overhead, which can be significant.

That's just one example. Another is the artificial barriers to work
stealing that recursive make imposes. I believe that once a group of
threads is assigned to do a sub-make, that group of threads is tied up
until the sub-make is finished, which means there are always a few idle
threads at the end of each sub-make. In a non-recursive build, idle
threads can go work on something else immediately. This could probably
be fixed via GNU make's -j pipe communication with sub-makes, but I'm
not sure it has been.

-Dave
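As a rough sketch of the jobserver mechanism Dave refers to (assumed GNU make
semantics; the GNU make manual is the authoritative description):

    # Invoking children via $(MAKE), or marking the recipe line with '+',
    # passes the jobserver pipe down through MAKEFLAGS, so parent and child
    # draw -j job tokens from one shared pool instead of each sub-make
    # getting a fixed private share.
    SUBDIRS := lib tools

    .PHONY: all $(SUBDIRS)
    all: $(SUBDIRS)

    $(SUBDIRS):
            +$(MAKE) -C $@

    # By contrast, hard-coding something like `make -C $@ -j4` in the recipe
    # would give every child its own job count, independent of the
    # top-level -j setting.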
On Tue, Nov 01, 2011 at 06:46:15PM -0500, David A. Greene wrote:

> The fact that the LLVM build has to run through all of the directories,
> read Makefiles, check dependencies in each Makefile, etc. In essence, a
> recursive make adds implicit dependencies on all of the sub-Makefiles.
> Those Makefiles have to be processed before any real work can begin.
> That includes shell overhead, which can be significant.

Makefiles don't include shell overhead. You have to parse the build rules
at some point anyway. There are some systems that support more aggressive
caching for that part (e.g. Ninja), but that's beside the point.

Subdirectories do not have to introduce serialisation points. Just as a
test case, I copied a number of single-file tools (apply, asa, basename,
bdes, biff, bthset, btpin, cal, cap_mkdb, cdplay, checknr) into a separate
subdirectory in my NetBSD src tree. I added a trivial Makefile just
listing those with SUBDIR and an include of bsd.subdir.mk.

make without -j: 1.275s
make -j4: 0.607s

After a full build:

make without -j: 0.11s
make -j4: 0.073s

That's on a mobile (dual core) i7 running at 1.2GHz. In short, there is no
serialisation point here.

The main objection to the "recursive make is harmful" article is that it
completely ignores the advantages of such a setup. Small per-directory
build rules are easier to understand and, as a direct consequence, easier
to maintain. The second objection is that many of the perceived
performance issues are a result of GNU make, as mentioned elsewhere in the
thread. Sure, there is overhead associated with creating ~370 processes.
But if that is the bottleneck of your build system, you should move to a
platform with less process-creation overhead. Unlike e.g. the GCC build,
LLVM scales well with the coarser granularity since there is no single
build action that takes as long as the rest of the build combined.

Joerg
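For reference, the trivial Makefile described above would look roughly like
this (a reconstruction, assuming the usual NetBSD bsd.subdir.mk conventions,
which handle the recursion into each listed subdirectory):

    # BSD make: descend into each tool's subdirectory.
    SUBDIR= apply asa basename bdes biff bthset btpin cal cap_mkdb cdplay checknr

    .include <bsd.subdir.mk>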
Chris Lattner <clattner at apple.com> writes:

>> You keep repeating that and I say that it is wrong. Can you mention a
>> serialization point in the LLVM build caused by recursive make?
>> (GenLibDeps is not such an example, as described in a previous message.)
>
> Any use of DIRS is a serialization point. For example, lib/Support ->
> lib/TableGen -> utils are all built in serial before anything else is.

AFAIU DIRS is used for lib/Support and lib/TableGen because the latter
depends on the former. This is not relevant for static builds (or shared
builds on Linux, IIRC), so if the `make' scripts were smart enough, that
case of DIRS could become something like PARALLEL_DIRS_IF_STATIC. That
would make it possible to compile the source files of both libraries at
the same time. BTW, the cmake build did that (I don't know if it keeps
doing it after the switch to explicit dependencies.)

But you and Dave are right wrt recursive make hiding details of the build.
A non-recursive make could compile the files of both libraries without
introducing any hacks.

My insistence on Dave being wrong comes from my experience with the cmake
build. There, we turn DIRS into PARALLEL_DIRS (except when building shared
libraries on OS X), and the amount of parallelism is huge. Essentially,
there are three phases: building Support & utils, building the rest of the
libraries, and building the tools (the final two phases are intermingled
to some extent.) I expect that even a 64-core machine would be kept quite
busy almost all the time.
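A minimal sketch of the DIRS vs. PARALLEL_DIRS contrast being discussed
(illustrative only; LLVM's actual Makefile.rules differs in detail):

    # With a PARALLEL_DIRS-style rule each directory is its own target with
    # no ordering between them, so `make -j` can descend into several at
    # once; any real dependency, such as a final link step needing
    # libLLVMSupport, then has to be stated explicitly somewhere.
    PARALLEL_DIRS := lib/Support lib/TableGen

    .PHONY: all $(PARALLEL_DIRS)
    all: $(PARALLEL_DIRS)

    $(PARALLEL_DIRS):
            $(MAKE) -C $@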