Davide Italiano via llvm-dev
2016-Nov-18 04:04 UTC
[llvm-dev] LLD: time to enable --threads by default
On Thu, Nov 17, 2016 at 7:34 PM, Rui Ueyama <ruiu at google.com> wrote:
> On Thu, Nov 17, 2016 at 6:30 PM, Davide Italiano <davide at freebsd.org> wrote:
>>
>> On Thu, Nov 17, 2016 at 1:20 PM, Rafael Espíndola via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>> >>
>> >> Thank you for the explanation! That makes sense.
>> >>
>> >> Unlike ThinLTO, each thread in LLD consumes a very small amount of
>> >> memory (probably just a few megabytes), so that's not a problem for
>> >> me. At the final stage of linking, we spawn threads to copy section
>> >> contents and apply relocations, and I guess that causes a lot of
>> >> memory traffic because that's basically memcpy'ing input files to an
>> >> output file, so the memory bandwidth could be a limiting factor
>> >> there. But I do not see a reason to limit the number of threads to
>> >> the number of physical cores. For LLD, it seems like we can just
>> >> spawn as many threads as HT provides.
>> >
>> > It is quite common for SMT to *not* be profitable. I did notice some
>> > small wins by not using it. On an Intel machine you can do a quick
>> > check by running with half the threads, since they always have 2x SMT.
>> >
>>
>> I had the same experience. Ideally I would like to have a way to
>> override the number of threads used by the linker.
>> gold has a plethora of options for doing that, i.e.
>>
>>   --thread-count COUNT          Number of threads to use
>>   --thread-count-initial COUNT  Number of threads to use in initial pass
>>   --thread-count-middle COUNT   Number of threads to use in middle pass
>>   --thread-count-final COUNT    Number of threads to use in final pass
>>
>> I don't think we need the full generality/flexibility of
>> initial/middle/final, but --thread-count could be useful (at least for
>> experimenting). The current interface of `parallel_for_each` doesn't
>> allow specifying the number of threads to run, so, assuming lld goes
>> that route (it may not), it should be extended accordingly.
>
> I agree that these options would be useful for testing, but I'm
> reluctant to expose them as user options because I wish LLD would just
> work out of the box without turning lots of knobs.
>

I share your view that lld should work fine out of the box. An
alternative might be to make the option hidden. The set of users who
tinker with linker options is not large, but some people do like to
override/"tune" the linker, so IMHO we should expose a sane default and
let users decide whether they care (a similar example is what we do for
--thinlto-threads or --lto-partitions, even if in the latter case we
still have it set to 1 because it's not entirely clear what a
reasonable number is).

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare
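For concreteness, here is a minimal sketch of what an extended
parallel_for_each could look like. The signature and chunking scheme
below are illustrative rather than lld's actual implementation, and the
sketch assumes random-access iterators:

#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Illustrative only: split [Begin, End) into NumThreads contiguous
// chunks and run F over each chunk on its own worker thread. A thread
// count of 0 or 1 falls back to a plain serial loop.
template <class Iter, class Func>
void parallel_for_each(Iter Begin, Iter End, Func F,
                       unsigned NumThreads =
                           std::thread::hardware_concurrency()) {
  std::ptrdiff_t Total = End - Begin;
  if (NumThreads <= 1 || Total <= 1) {
    std::for_each(Begin, End, F);
    return;
  }
  std::ptrdiff_t Chunk = (Total + NumThreads - 1) / NumThreads;
  std::vector<std::thread> Workers;
  for (std::ptrdiff_t Lo = 0; Lo < Total; Lo += Chunk) {
    std::ptrdiff_t Hi = std::min(Lo + Chunk, Total);
    Workers.emplace_back([=] { std::for_each(Begin + Lo, Begin + Hi, F); });
  }
  for (std::thread &T : Workers)
    T.join();
}

With an overload along these lines, a --thread-count value would just be
plumbed through as the last argument, and Rafael's no-SMT experiment
becomes passing std::thread::hardware_concurrency() / 2.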
Davide Italiano via llvm-dev
2016-Nov-18 04:09 UTC
[llvm-dev] LLD: time to enable --threads by default
On Thu, Nov 17, 2016 at 8:04 PM, Davide Italiano <davide at freebsd.org> wrote:
> On Thu, Nov 17, 2016 at 7:34 PM, Rui Ueyama <ruiu at google.com> wrote:
>> [...]
>> I agree that these options would be useful for testing, but I'm
>> reluctant to expose them as user options because I wish LLD would just
>> work out of the box without turning lots of knobs.
>>
>
> I share your view that lld should work fine out of the box. An
> alternative might be to make the option hidden. The set of users who
> tinker with linker options is not large, but some people do like to
> override/"tune" the linker, so IMHO we should expose a sane default and
> let users decide whether they care (a similar example is what we do for
> --thinlto-threads or --lto-partitions, even if in the latter case we
> still have it set to 1 because it's not entirely clear what a
> reasonable number is).
>

I've seen a case where the linker was pinned to a specific subset of
the CPUs and many linker invocations were launched in parallel
(actually, this is the only time I've seen --threads for gold used).
I personally don't expect this to be the common use case, but it's not
hard to imagine complex build systems adopting a similar strategy.

-- 
Davide

"There are no solved problems; there are only problems that are more
or less solved" -- Henri Poincare
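In the pinned scenario above, defaulting to
std::thread::hardware_concurrency() would oversubscribe the CPUs the
process is actually allowed to run on. A hedged sketch of a
Linux-specific default that respects the affinity mask instead
(sched_getaffinity and CPU_COUNT are glibc facilities; the surrounding
defaulting logic is illustrative, not lld's code):

#define _GNU_SOURCE  // for CPU_COUNT on glibc
#include <sched.h>
#include <thread>

// Prefer the number of CPUs in our affinity mask (e.g. when a build
// system launches us under taskset or a cpuset), falling back to the
// total hardware thread count, and to 1 if even that is unknown.
static unsigned defaultThreadCount() {
  cpu_set_t Set;
  CPU_ZERO(&Set);
  if (sched_getaffinity(0, sizeof(Set), &Set) == 0)
    if (int N = CPU_COUNT(&Set))
      return static_cast<unsigned>(N);
  unsigned HW = std::thread::hardware_concurrency();
  return HW ? HW : 1;
}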
Rui Ueyama via llvm-dev
2016-Nov-18 17:25 UTC
[llvm-dev] LLD: time to enable --threads by default
Sure. If you want to add --thread-count (but not other options, such as
--thread-count-initial), that's fine with me.

On Thu, Nov 17, 2016 at 8:09 PM, Davide Italiano <davide at freebsd.org> wrote:
> [...]
>
> I've seen a case where the linker was pinned to a specific subset of
> the CPUs and many linker invocations were launched in parallel
> (actually, this is the only time I've seen --threads for gold used).
> I personally don't expect this to be the common use case, but it's not
> hard to imagine complex build systems adopting a similar strategy.
>
> --
> Davide
>
> "There are no solved problems; there are only problems that are more
> or less solved" -- Henri Poincare
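Concretely, here is one way such a hidden knob could be declared with
LLVM's cl::opt command-line library. The option name and default below
are illustrative, and lld's actual drivers parse flags through their
own option tables rather than cl::opt:

#include "llvm/Support/CommandLine.h"

// Illustrative only: a knob that exists for experiments but is not
// advertised in -help output, matching the "sane default, hidden
// override" approach discussed above. 0 means "choose automatically".
static llvm::cl::opt<unsigned> ThreadCount(
    "thread-count",
    llvm::cl::desc("Number of threads to use (0 = choose automatically)"),
    llvm::cl::init(0), llvm::cl::Hidden);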