Stella Laurenzo via llvm-dev
2020-Oct-30 04:22 UTC
[llvm-dev] Contributing Bazel BUILD files similar to gn
On Thu, Oct 29, 2020, 8:19 PM Eric Astor via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> I *am* a Googler, though not directly involved with the teams that maintain the internal LLVM build. I happen to be a big fan of Bazel - and mostly build LLVM with the internal Bazel build, rather than the external CMake, because the better caching and remote-build-farm support are such an enormous help. (Also, I find the CMake build & build options kind of impenetrable.) However, I'm writing this particular email on my personal account, with personal resources, well past the close of business; my Google hat is firmly on the shelf, and I'm speaking as just an individual contributor.
>
> When I first started contributing to LLVM, *I was confused by the GN build's existence*. I didn't understand who was supposed to maintain it, whether I should use it, what the benefits were... you name it.
>
> I agree with some of the first comments on this thread. I'd suggest that we set aside the question of contributing Bazel BUILD files into the LLVM repository for now, and start by proposing a general policy around alternate/unsupported build systems in relation to the main repository. (GN can have an exception if needed.) The fact that the GN build is basically working, and doesn't confuse too many people, is a data point - but going from 1 alternate build system to 2 seems like a good point to pause and set an actual set of constraints and goals. Eventually, someone may want a third, and we should know what the guidelines are so we don't hash out the decision from scratch again!
>
> I don't think I could draft the RFC in question - I don't have enough experience with the community yet to judge what's really needed - but I'd be glad to help out with it. The idea should be to minimize the cost (to nearly zero) for both experienced LLVM contributors and new LLVM contributors. A few requirements I'd suggest, mostly put together from this thread:
>
> - CMake should be able to build (and test!) everything the alternate build system can, at all times.
> - There must be a clear group who want to maintain the alternate build system.
> - The alternate build system's files should be isolated in a separate directory, with a README explaining that this is an alternate build system for LLVM, maintained by its own smaller community - and is not supported by the community at large.
> - The alternate build system must have independent buildbots, which do not email the larger community; people can opt into being emailed about this. (And should, if they're contributing to it!)
> - If the buildbots are red for an extended time, we should put out a call for maintainers to fix the issues; if not answered in a reasonable time, we shouldn't be afraid to delete the alternate build system.
>
> I *do* also see the argument for the git submodule approach. It looks like a .gitmodules file would theoretically let a repository of Bazel BUILD files specify exactly which LLVM commit it currently tracks - and you could fetch the corresponding updates in both with a single command. I think that addresses the main point I noticed brought up on this side of the argument. Any RFC here probably needs to present pros & cons of both approaches. We'll need to hash those out in general discussion before people start looking for consensus, so people understand what they're deciding on.

Just one note on this... But first, I am also a Googler, and while I use bazel a lot, I don't see it being any more than a niche anytime soon for a mainstream project such as LLVM that has a wide deployment base, many variants/layers, cross compilation, build/install splits, etc. Bazel just doesn't scale to the level of differentiation and customization that is exploited for this scale of an OSS project in the wild. It was born in a much less diverse environment and carries that legacy forward (and seems like it will continue to do so for the foreseeable future). And I say that as someone who likely has enough years of experience in it that I could probably bend it in those directions if it came down to it... But wouldn't consider it a valuable use of time.

From what I can tell, people are successful/happy using bazel when their needs are not so diverse, and when they value org-scale consistency and scalability of their eng teams. It's not the only way to get that, for sure. Just a way that some choose, and some of those also choose/need to take LLVM as a dependency. (I often find it too restrictive and choose differently myself.)

That interpretation would lead to an answer to "why would we do this?": because it would help those people who use both bazel and LLVM to have an easier time living at head with LLVM as a dependency. Most of those people didn't actively choose bazel, and are in the same kind of mode of trying to minimize their costs for a large piece of dev infra that isn't core to their business/mission... Same as LLVM with cmake. Google internally and Google aligned open source projects certainly fall into that category. I can't speak for others.

As for the costs, I could go either way on whether this should live in the monorepo. Even segmented into its own directory, the argument regarding the cost of confusion/churn seems credible to me (even if the cost is deemed worth it, I do see it as a cost that has merit to consider).

On to my note...

One other cost to consider is that if we have this outside of the monorepo, and outside of the LLVM organization, we have a contribution barrier up which firmly entrenches this as a "Google thing", and I don't think that is a good thing for LLVM as a project... There will be a different committer pool, different policy enforcement (such as accepting Google's CLA), different comms channels, etc. Projects, both OSS and private, outside of Google do use both bazel and LLVM, and it would be best, in my opinion, if they could source and contribute all of the LLVM bits from the LLVM org, including second tier build support, where it exists (and we should clearly cordon this off as some kind of second tier).

In my mind, the best outcomes here involve deciding on a least harmful place to maintain these second tier build setups, and my preference would be that they be aligned with the llvm community/org vs on an island. I don't have an opinion on whether this lands in the monorepo or a secondary repo for second tier build setups. But I would like to see one of those outcomes vs keeping this Google aligned/owned.

> Best,
> - Eric
>
> On Thu, Oct 29, 2020 at 10:40 PM Eric Christopher via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Thu, Oct 29, 2020 at 9:44 PM Johannes Doerfert via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>>> (see below)
>>>
>>> On 10/28/20 6:18 PM, Geoffrey Martin-Noble via llvm-dev wrote:
>>> > Hi all,
>>> >
>>> > tl;dr: We'd like to contribute Bazel BUILD files for LLVM and MLIR in a side-directory in the monorepo, similar to the gn build.
>>> >
>>> > Some of us have been working on open-source Bazel BUILD files for the LLVM Project. You may have seen us hanging out in the #build-systems Discord channel. As you may know, Google uses Bazel internally and has maintained a Bazel BUILD of LLVM for years. Especially with the introduction of MLIR, we've got more and more OSS projects with a Bazel BUILD depending on LLVM (e.g. IREE <https://github.com/google/iree> and TensorFlow <https://github.com/tensorflow/tensorflow>). We're also not the only ones using Bazel: e.g. PlaidML also has a Bazel BUILD of LLVM that they've borrowed from TF <https://github.com/plaidml/plaidml/blob/master/vendor/llvm/llvm.BUILD>. Each of these projects has to jump through some weird hoops to keep their version of the Bazel BUILD files in sync with the code, which requires some fragile combination of scripts and human intervention. Instead, we'd like to move general-purpose Bazel BUILD files into the LLVM Project monorepo. We expect to follow the model of the GN build where these will be maintained by interested contributors rather than expecting the general community to maintain them.
>>> >
>>> > To facilitate and test this we've been developing a standalone repository that just has the Bazel BUILD files. It symlinks together the directory trees on top of a submodule as we would need in the monorepo to avoid in-tree BUILD files. The configuration is at https://github.com/google/llvm-bazel. We now have those in a good place and think they would be useful upstream.
>>> >
>>> > # Details
>>> >
>>> > ## What
>>> >
>>> > Bazel BUILD files for the LLVM, MLIR, and Clang (PR out for review <https://github.com/google/llvm-bazel/pull/72>) subprojects, potentially expanding to others, as needed. Basically everything currently at https://github.com/google/llvm-bazel.
>>> >
>>> > ## Where
>>> >
>>> > In https://github.com/google/llvm-bazel the BUILD files live in a single directory tree matching the structure of the overall llvm-project directory. For users, @llvm-project is a single Bazel repository <https://docs.bazel.build/versions/master/build-ref.html#repositories> that includes both LLVM and MLIR subprojects. To maintain this structure, we would probably want to put a `bazel` directory in the monorepo's utils directory <https://github.com/llvm/llvm-project/tree/master/utils>, which currently only contains a directory for arcanist. This is different from gn, which is under the LLVM subproject's utils directory <https://github.com/llvm/llvm-project/tree/master/llvm/utils/gn>. We could similarly put the Bazel BUILD files under llvm/utils/bazel but have them be for the entire llvm project (the subsets that are supported). This seems like an odd structure to me, but I know that the CMake build for LLVM also builds the other subprojects <https://github.com/llvm/llvm-project/blob/529ac33197f6/llvm/tools/CMakeLists.txt#L34-L41>, so maybe this would be preferable.
>>> >
>>> > Alternatively we could split each subproject into a separate Bazel repository and put the Bazel build files under each subproject. I think this fragments the configuration of the BUILD without much benefit.
>>> >
>>> > ## Configurations
>>> >
>>> > We currently have configurations for Linux GCC and Clang, macOS GCC and Clang, and Windows MSVC. Support for other configurations can be added as desired, but supporting all possible LLVM build configurations is not the goal.
>>> >
>>> > ## Support
>>> >
>>> > Support would be similar to the gn build. Contributors could optionally update the Bazel BUILD files as part of their patches, but would be under no obligation to do so.
>>> >
>>> > ## Preserving History
>>> >
>>> > I don't *think* the history of llvm-bazel is interesting enough to try to merge it into the monorepo and I was planning to submit this as a single patch, but please let me know if you disagree.
>>> >
>>> > ## Benefits to the community
>>> >
>>> > - Projects that depend on LLVM and use the Bazel build system can avoid duplicating fragile effort. We'll spend more time contributing to LLVM instead :-D
>>> > - Bazel is stricter than CMake in many ways (e.g. it requires that even header dependencies be declared) and can catch layering issues very easily. There's even an optional layering_check feature we could turn on if its use would benefit the community (though currently the existing problematic layering makes it a burden to maintain on our own). Even without that additional check, as I've been keeping the Bazel build green, I've found and fixed a number of layering issues in the past couple weeks (e.g. https://reviews.llvm.org/rGb49787df9a <https://reviews.llvm.org/rGb49787df9a535f03761c340dca7ec3ec1155133d> and https://reviews.llvm.org/rGc17ae2916c <https://reviews.llvm.org/rGc17ae2916ccf45a0c1717bd5f11598cc4fff342a>).
>>> >
>>> > Here's a patch <https://reviews.llvm.org/D90352> adding the Bazel build system. It's basically just `cp -r llvm-bazel/llvm-bazel llvm-project/utils/bazel`.
>>>
>>> Doesn't the last paragraph mean all benefits derived from this can be described either as:
>>> (1) users do not need to clone the llvm-bazel git repo but get the files in llvm-project, or
>>> (2) "interested contributors" could send patches to llvm-project instead of llvm-bazel to update the bazel build.
>>
>> Absolutely. This could happen. The main reason behind this is to make integrating among a number of llvm based projects that use bazel easier (TF and TF-based projects primarily, though it sounds like FB's internal process would be helped as their system is similar to bazel).
>>
>>> TBH, I have no interest in using bazel nor anything against it being merged per se. I just find it curious that we merge another build system "at no cost" for the community (I think I picked that up in the thread but I might have imagined the phrasing). I mean, there is always "a cost"* so it boils down to determining if the benefit is worth it.
>>
>> As far as I can think the cost is...
>>
>>> * i.a., people will assume we (=the LLVM community) maintain(s) a bazel build, which can certainly be a benefit but also "a cost", e.g., when the build is not properly maintained, support is scarce, etc. and emails come in complaining about it (not thinking of prior examples here.)
>>
>> ... this. If the system becomes a source of problems or user complaints then I think it's absolutely reasonable to remove it.
>>
>> -eric
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
Neil Nelson via llvm-dev
2020-Oct-30 05:39 UTC
[llvm-dev] Contributing Bazel BUILD files similar to gn
Some good remarks Stella.

What does *second tier* mean?

There are additional directories in the LLVM download such as flang, compiler-rt, openmp, but these do not seem to be second-tier though there may be a sense in which they are.

Is the idea of second-tier that there will be additional directories or programs embedded in the existing LLVM directories not available for use to those without bazel? If that is the case, then what is the relevance of those contributions?

It seems we are saying that if a contribution is relevant then either it is in the cmake build, making bazel superfluous to obtain a build, or it is in a bazel-only build. A cmake build would be required for the parts we have now and then an additional bazel build for the second-tier parts.

There is talk of gn. I am not seeing gn installed here but am not aware it is required. Is it the case that whatever gn does, cmake does, or is it the case there is a necessary gn build sequence in LLVM somewhere?

Neil Nelson

On 10/29/20 10:22 PM, Stella Laurenzo via llvm-dev wrote:

> On to my note...
>
> One other cost to consider is that if we have this outside of the monorepo, and outside of the LLVM organization, we have a contribution barrier up which firmly entrenches this as a "Google thing", and I don't think that is a good thing for LLVM as a project... There will be a different committer pool, different policy enforcement (such as accepting Google's CLA), different comms channels, etc. Projects, both OSS and private, outside of Google do use both bazel and LLVM, and it would be best, in my opinion, if they could source and contribute all of the LLVM bits from the LLVM org, including second tier build support, where it exists (and we should clearly cordon this off as some kind of second tier).
Eric Christopher via llvm-dev
2020-Oct-30 05:58 UTC
[llvm-dev] Contributing Bazel BUILD files similar to gn
Hi Neil,

To try to elaborate more on what's been said:

cmake is the supported build system and it's what is used to build all llvm projects. Anything else may also build llvm projects but won't be required for them. There won't be any projects in the llvm tree that don't build with cmake.

There is no change planned or proposed to change the default and supported build system for llvm. I'd actually be quite strongly against that for a few reasons - primarily that cmake as a meta build system allows us to meet developers where they are in their development environments. Relatedly gn and bazel would allow us to do that for a different set of developers, but they don't have the reach or capability of cmake and I don't expect them to.

Hope this is helpful and feel free to let me know if you have any other questions :)

-eric

On Fri, Oct 30, 2020, 1:39 AM Neil Nelson via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Some good remarks Stella.
>
> What does *second tier* mean?
>
> There are additional directories in the LLVM download such as flang, compiler-rt, openmp, but these do not seem to be second-tier though there may be a sense in which they are.
>
> Is the idea of second-tier that there will be additional directories or programs embedded in the existing LLVM directories not available for use to those without bazel? If that is the case, then what is the relevance of those contributions?
>
> It seems we are saying that if a contribution is relevant then either it is in the cmake build, making bazel superfluous to obtain a build, or it is in a bazel-only build. A cmake build would be required for the parts we have now and then an additional bazel build for the second-tier parts.
>
> There is talk of gn. I am not seeing gn installed here but am not aware it is required. Is it the case that whatever gn does, cmake does, or is it the case there is a necessary gn build sequence in LLVM somewhere?
>
> Neil Nelson
>
> On 10/29/20 10:22 PM, Stella Laurenzo via llvm-dev wrote:
>
>> On to my note...
>>
>> One other cost to consider is that if we have this outside of the monorepo, and outside of the LLVM organization, we have a contribution barrier up which firmly entrenches this as a "Google thing", and I don't think that is a good thing for LLVM as a project... There will be a different committer pool, different policy enforcement (such as accepting Google's CLA), different comms channels, etc. Projects, both OSS and private, outside of Google do use both bazel and LLVM, and it would be best, in my opinion, if they could source and contribute all of the LLVM bits from the LLVM org, including second tier build support, where it exists (and we should clearly cordon this off as some kind of second tier).
Stella Laurenzo via llvm-dev
2020-Oct-30 06:03 UTC
[llvm-dev] Contributing Bazel BUILD files similar to gn
On Thu, Oct 29, 2020 at 10:39 PM Neil Nelson via llvm-dev <llvm-dev at lists.llvm.org> wrote:

> Some good remarks Stella.
>
> What does *second tier* mean?

It was mainly me reaching (probably badly) for a term to sum up what others have alluded to:

- Has some level of community interest in consuming LLVM as a dep from the build setups that use it -- at least enough to commit to maintaining it in reasonably good working order.
- Has no expectation of maintenance by those not associated with consuming it.
- Is not distributed as part of LLVM releases.
- Is not guaranteed to work at arbitrary LLVM commits.
- Probably has a Discourse tag under an "Unsupported Build Systems" top-level or something.
- Is advertised specifically as an unsupported way to take a dep on LLVM (but if you have problems, feel free to reach out to <insert discourse tag> for ad-hoc help).
- Could be deleted at any time (but with notice given to users so they can take a copy with them).

Practically, I've seen many build systems with some level of interop available with respect to their more established peers (i.e. New Build System A can source deps from Older Build System B), but often with an impedance mismatch that tends to make it suitable for relatively simple things and hard to maintain for more complicated ones. This is actually the case with Bazel (and based on the FB feedback, to some extent Buck) -- and can be severe enough to make maintaining a mirror built natively for the build system worth it. This seems to impact people using "hermetic" build systems the worst because they like to have pure source deps on everything that they possibly can, and the usual escape hatches (install and point to headers/libs) don't work well for their setups for whatever reason.

I don't think the LLVM Project should have any obligation to keep such things running, but providing some hosting and a place to live for what amounts to real people trying to use, interact with (and often contribute to) the project seems like a pretty minimal thing to provide and it helps keep the community together, even though some members have chosen to live on weird islands and like to declare dependencies on all of their headers :)

> There are additional directories in the LLVM download such as flang, compiler-rt, openmp, but these do not seem to be second-tier though there may be a sense in which they are.
>
> Is the idea of second-tier that there will be additional directories or programs embedded in the existing LLVM directories not available for use to those without bazel? If that is the case, then what is the relevance of those contributions?
>
> It seems we are saying that if a contribution is relevant then either it is in the cmake build, making bazel superfluous to obtain a build, or it is in a bazel-only build. A cmake build would be required for the parts we have now and then an additional bazel build for the second-tier parts.
>
> There is talk of gn. I am not seeing gn installed here but am not aware it is required. Is it the case that whatever gn does, cmake does, or is it the case there is a necessary gn build sequence in LLVM somewhere?
>
> Neil Nelson
>
> On 10/29/20 10:22 PM, Stella Laurenzo via llvm-dev wrote:
>
>> On to my note...
>>
>> One other cost to consider is that if we have this outside of the monorepo, and outside of the LLVM organization, we have a contribution barrier up which firmly entrenches this as a "Google thing", and I don't think that is a good thing for LLVM as a project... There will be a different committer pool, different policy enforcement (such as accepting Google's CLA), different comms channels, etc. Projects, both OSS and private, outside of Google do use both bazel and LLVM, and it would be best, in my opinion, if they could source and contribute all of the LLVM bits from the LLVM org, including second tier build support, where it exists (and we should clearly cordon this off as some kind of second tier).
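
For readers unfamiliar with what "declare dependencies on all of their headers" means in practice, here is a rough Python sketch of the idea. It is a toy model, not Bazel's implementation or API: the `Target` class, the `undeclared_includes` function, and the dependency edges between the example targets are all hypothetical, and it assumes the strictest rule, where only directly declared deps count.

```python
# Toy model of strict header-dependency checking (hypothetical, not Bazel's API).
# Each target lists the headers it exports, the deps it declares, and the
# headers its sources include. An include is flagged when no declared dep
# (or the target itself) exports that header.

from dataclasses import dataclass, field


@dataclass
class Target:
    name: str
    hdrs: set = field(default_factory=set)      # headers this target exports
    deps: set = field(default_factory=set)      # names of directly declared deps
    includes: set = field(default_factory=set)  # headers its sources include


def undeclared_includes(targets):
    """Return {target name: sorted headers included without a declaring dep}."""
    by_name = {t.name: t for t in targets}
    problems = {}
    for t in targets:
        # Assumption: only the target's own headers and those of its
        # *direct* deps are allowed; transitive deps do not count.
        allowed = set(t.hdrs)
        for dep in t.deps:
            allowed |= by_name[dep].hdrs
        missing = sorted(t.includes - allowed)
        if missing:
            problems[t.name] = missing
    return problems


# Illustrative targets; the dependency edges are made up for this example.
targets = [
    Target("Support", hdrs={"llvm/Support/Debug.h"}),
    Target("IR", hdrs={"llvm/IR/Module.h"}, deps={"Support"},
           includes={"llvm/Support/Debug.h"}),
    # "Analysis" includes a Support header but only declares a dep on IR.
    Target("Analysis", hdrs={"llvm/Analysis/LoopInfo.h"}, deps={"IR"},
           includes={"llvm/IR/Module.h", "llvm/Support/Debug.h"}),
]

print(undeclared_includes(targets))
# prints {'Analysis': ['llvm/Support/Debug.h']}
```

In this model the fix is to add "Support" to the deps of "Analysis", which is the kind of layering correction the original RFC mentions finding while keeping the Bazel build green.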