Geoffrey Martin-Noble via llvm-dev
2020-Nov-17 03:01 UTC
[llvm-dev] RFC: Contributing Bazel BUILD files in the "peripheral" support tier
I previously <https://groups.google.com/g/llvm-dev/c/u07o3QREVUg/> proposed contributing Bazel build files to the LLVM monorepo, supported *only* by interested community members and not to interfere with or affect the existing CMake configuration. As part of that conversation, it became clear that the LLVM policies for more "peripheral" components were not clearly documented. We now have a shiny new Support Policy <http://llvm.org/docs/SupportPolicy.html>. Thank you, Renato for moving that forward. Please take a look at it, if you haven't already. I would now like to re-raise the proposal to contribute Bazel build files to the LLVM monorepo. I am starting a new thread, as the last one became rather fragmented. This build configuration would be added at the peripheral support level <http://llvm.org/docs/SupportPolicy.html#peripheral-tier> to a new `utils/bazel` directory. I've prepared a patch <https://reviews.llvm.org/D90352> of what I am proposing to add. It includes a README indicating the level of support. It is largely a port of the Bazel build files Google uses internally and has maintained for several years. This should have approximately the same impact on the community as the current GN build in `llvm/utils/gn` does today. That is, it should not affect anyone who doesn't care. I've commented on the specific requirements <http://llvm.org/docs/SupportPolicy.html#id2> listed in the support policy inline: Code in this tier must:>* Have a clear benefit for residing in the main repository, catering to an> active sub-community (upstream or downstream). >A number of projects build LLVM with Bazel (e.g. IREE <https://github.com/google/iree>, TensorFlow <https://github.com/tensorflow/tensorflow>, PlaidML <https://github.com/plaidml/plaidml/blob/master/vendor/llvm/llvm.BUILD>). Google also uses Bazel to build in its internal source repository. This includes a substantial number of developers and active contributors to LLVM. Adding this to the monorepo would provide a natural coordination point for these projects and avoid fragmentation (projects currently have their own copies of the BUILD files) or Google-centric governance (e.g. signing Google's CLA).> * Be actively maintained by such sub-community and have its problems > addressed in a timely manner. >We can commit to maintaining and addressing issues with the configuration. Google has maintained its internal version of this configuration for a few years.>Code in this tier must not:> * Break or invalidate core tier code or infrastructure. If that happens > accidentally, reverting functionality and working on the issues offline is > the only acceptable course of action. >There should be no interaction between the Bazel build configuration and any core code or infrastructure.> * Negatively affect development of core tier code, with the sub-community > involved responsible for making changes to address specific concerns. >This should not affect development of core tier components. One reason we propose adding this to the root utils/ directory instead of under llvm/utils (where GN is located) is to avoid unnecessarily sending messages to llvm-commits. Others have raised the concern that the existence of an alternative build system might lead to lack of maintenance for the CMake build system. Given that supporting CMake will remain a requirement and maintenance of a Bazel build system will continue to happen regardless, we do not expect any significant impact in this way.> * Negatively affect other peripheral tier code, with the sub-communities > involved tasked to resolve the issues, still making sure the solution > doesn’t break or invalidate the core tier. >Similarly, this should have no interaction with other components in the peripheral tier. We will address conflicts if they arise.> * Impose sub-optimal implementation strategies on core tier components as > a result of idiosyncrasies in the peripheral component. >We do not expect any negative constraints on normal development of core tiers. Bazel is stricter about layering, which may help quickly identify layering issues in incoming commits.> * Have build infrastructure that spams all developers about their > breakages. >Build infrastructure will be configured to only notify opted-in developers.> * Fall into disrepair. This is a reflection of lack of an active > sub-community and will result in removal.Build bots with accompanying status badges will be prominently linked from the README. Currently a Linux/Clang build bot exists and can be easily reconfigured after the code move. A windows build bot will be added soon.>Code in this tier should:> * Have infrastructure to test, whenever meaningful, with either no > warnings or notification contained within the sub-community. >* Have support and testing that scales with the complexity and resilience> of the component, with the bar for simple and gracefully-degrading > components (such as editor bindings) much lower than for complex components > that must remain fresh with HEAD (such as experimental back-ends or > alternative build systems). >Build bot coverage already exists and will be expanded as described above.> * Have a document making clear the status of implementation, level of > support available, who the sub-community is and, if applicable, roadmap for > inclusion into the core tier. >The patch includes a README that should make the support level clear. I am happy to add additional language or take additional steps to make that more clear (e.g. adding `unsupported` to the directory path).> * Be restricted to a specific directory or have a consistent pattern (ex. > unique file suffix), making it easy to remove when necessary. >All configuration is restricted to a single directory and should be trivial to remove. A number of people raised the question of "why not a separate repository". This is indeed possible: It's what we've done with https://github.com/google/llvm-bazel, which is currently used by https://github.com/google/iree. It is significantly more infrastructure, coordination, and complexity for something that is specifically a configuration for the LLVM project itself, not its own dependent or adjacent project. I believe this contribution will significantly improve the situation for downstream users that use Bazel while having minimal impact on the community at large. Thanks, Geoffrey -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201116/22a3b454/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3992 bytes Desc: S/MIME Cryptographic Signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201116/22a3b454/attachment.bin>
Tom Stellard via llvm-dev
2020-Nov-17 05:41 UTC
[llvm-dev] RFC: Contributing Bazel BUILD files in the "peripheral" support tier
On 11/16/20 10:01 PM, Geoffrey Martin-Noble via llvm-dev wrote:> I previously <https://groups.google.com/g/llvm-dev/c/u07o3QREVUg/ > > proposed contributing Bazel build files to the LLVM monorepo, > supported *only* by interested community members and not to interfere > with or affect the existing CMake configuration. As part of that > conversation, it became clear that the LLVM policies for more > "peripheral" components were not clearly documented. We now have a shiny > new Support Policy <http://llvm.org/docs/SupportPolicy.html>. Thank you, > Renato for moving that forward. Please take a look at it, if you haven't > already. > > I would now like to re-raise the proposal to contribute Bazel build > files to the LLVM monorepo. I am starting a new thread, as the last one > became rather fragmented. > > This build configuration would be added at the peripheral support level > <http://llvm.org/docs/SupportPolicy.html#peripheral-tier> to a new > `utils/bazel` directory. I've prepared a patch > <https://reviews.llvm.org/D90352> of what I am proposing to add. It > includes a README indicating the level of support. It is largely a port > of the Bazel build files Google uses internally and has maintained for > several years. > > This should have approximately the same impact on the community as the > current GN build in `llvm/utils/gn` does today. That is, it should not > affect anyone who doesn't care. >I want to push back on this a little bit, because having the code in tree does impact everyone, even people who don't care about it. It increases disk usage, commit traffic, checkout times, bugzilla / issue traffic, and CI builds to name a few things. There are costs to having this in tree, the question (as always) is do the benefits outweigh the costs? (More comments below).> I've commented on the specific requirements > <http://llvm.org/docs/SupportPolicy.html#id2> listed in the support > policy inline: > > Code in this tier must: > > * Have a clear benefit for residing in the main repository, catering > to an active sub-community (upstream or downstream). > > A number of projects build LLVM with Bazel (e.g. IREE > <https://github.com/google/iree>, TensorFlow > <https://github.com/tensorflow/tensorflow>, PlaidML > <https://github.com/plaidml/plaidml/blob/master/vendor/llvm/llvm.BUILD>). Google > also uses Bazel to build in its internal source repository. This > includes a substantial number of developers and active contributors to > LLVM. Adding this to the monorepo would provide a natural coordination > point for these projects and avoid fragmentation (projects currently > have their own copies of the BUILD files) or Google-centric governance > (e.g. signing Google's CLA). > > * Be actively maintained by such sub-community and have its problems > addressed in a timely manner. > > We can commit to maintaining and addressing issues with the > configuration. Google has maintained its internal version of this > configuration for a few years. > > Code in this tier must not: > * Break or invalidate core tier code or infrastructure. If that > happens accidentally, reverting functionality and working on the > issues offline is the only acceptable course of action. > > There should be no interaction between the Bazel build configuration and > any core code or infrastructure. > > * Negatively affect development of core tier code, with the > sub-community involved responsible for making changes to address > specific concerns. > > This should not affect development of core tier components. One reason > we propose adding this to the root utils/ directory instead of under > llvm/utils (where GN is located) is to avoid unnecessarily sending > messages to llvm-commits. Others have raised the concern that the > existence of an alternative build system might lead to lack of > maintenance for the CMake build system. Given that supporting CMake will > remain a requirement and maintenance of a Bazel build system will > continue to happen regardless, we do not expect any significant impact > in this way. > > * Negatively affect other peripheral tier code, with the > sub-communities involved tasked to resolve the issues, still making > sure the solution doesn’t break or invalidate the core tier. > > Similarly, this should have no interaction with other components in the > peripheral tier. We will address conflicts if they arise. > > * Impose sub-optimal implementation strategies on core tier > components as a result of idiosyncrasies in the peripheral component. > > We do not expect any negative constraints on normal development of core > tiers. Bazel is stricter about layering, which may help quickly identify > layering issues in incoming commits. > > * Have build infrastructure that spams all developers about their > breakages. > > Build infrastructure will be configured to only notify opted-in developers. > > * Fall into disrepair. This is a reflection of lack of an active > sub-community and will result in removal. > > Build bots with accompanying status badges will be prominently linked > from the README. Currently a Linux/Clang build bot exists and can be > easily reconfigured after the code move. A windows build bot will be > added soon. > > Code in this tier should: > * Have infrastructure to test, whenever meaningful, with either no > warnings or notification contained within the sub-community. > > * Have support and testing that scales with the complexity and > resilience of the component, with the bar for simple and > gracefully-degrading components (such as editor bindings) much lower > than for complex components that must remain fresh with HEAD (such > as experimental back-ends or alternative build systems). > > Build bot coverage already exists and will be expanded as described above. > > * Have a document making clear the status of implementation, level > of support available, who the sub-community is and, if applicable, > roadmap for inclusion into the core tier. > > The patch includes a README that should make the support level clear. I > am happy to add additional language or take additional steps to make > that more clear (e.g. adding `unsupported` to the directory path). > > * Be restricted to a specific directory or have a consistent pattern > (ex. unique file suffix), making it easy to remove when necessary. > > All configuration is restricted to a single directory and should be > trivial to remove. > > A number of people raised the question of "why not a separate > repository". This is indeed possible: It's what we've done with > https://github.com/google/llvm-bazel, which is currently used by > https://github.com/google/iree. It is significantly more infrastructure, > coordination, and complexity for something that is specifically a > configuration for the LLVM project itself, not its own dependent or > adjacent project. >Personally, I do not think we should have alternative build systems in tree. However, I still think you should try to propose this as a pitch. I would much rather this go through a fair process and land than for it to be rejected based on a contentious thread. Here is why I'm not convinced this should be in tree: To me it's not clear why having the build files in-tree is better than having a separate repo with an llvm-project sub-module. The in tree bazel files will be broken from time to time, since most developers will not be updating them, however, with the sub-module approach you can ensure that the build will always work by pinning the llvm-bazel repo to a known-working commit of llvm-project. Can you expand on the pros/cons of in-tree vs out-of-tree with sub-modules. Other concerns I have from reviewing the patch: * It looks like there is a build configuration for at least one external project (zlib) and possibly another (vulkan-headers?). Do we really want to have build configurations for non-LLVM projects in our tree? Is there any limit to the number of external projects that can and will be added? * There are 3 files (abi-breaking.h.cmake, config.h.cmake, llvm-config.h.cmake) that have been copied from the llvm tree into utils/bazel/, is there some way we can avoid carrying multiple copies of the same file in tree? * Similarly, there are some files that are normally generated at build time clang/Config/config.h, llvm/Config/config.h, llvm/Config/llvm-config.h that have been copied into utils/bazel. Is it really necessary to have these in tree? Especially since some of the templates, like llvm-config.h.cmake, are also in utils/bazel? * I still worry about the bazel files causing merging conflicts when backported to the stable branch. If these are added to tree, could we have a rule where commits to utils/bazel/ cannot include changes to other files? * If we have 2 alternative build systems in tree, what's the criteria for adding more? Do they just need to meet the requirements of the "peripheral support level" ? Can we continue to add new build systems with no limit? I still think this needs to be addressed. Expanding on this last point a little bit, this raises some larger questions about what code should be allowed in tree. Essentially what we have here is that a critical part of the LLVM project has been re-implemented and is now being asked to be included in tree alongside the original implementation (CMake). There are parts of the codebase where this would clearly not be OK (e.g. a re-implementation of one of the backends), but for build systems I think you can make a valid case to either have it or not to have it. And this is why I think it should be a pitch. In my opinion, these kinds of higher-level decisions are better made by review managers than by people on the mailing list. The other nice thing about a pitch is that we don't need to spend days arguing about this on the mailing list. You can take my feedback, think about it, and if you think there is some validity to what I have said, then all you need to do is update your proposal to address my concerns. And if not, then you can just move on to the next email. Thanks, Tom> I believe this contribution will significantly improve the situation for > downstream users that use Bazel while having minimal impact on the > community at large. > > Thanks, > Geoffrey > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Renato Golin via llvm-dev
2020-Nov-17 10:26 UTC
[llvm-dev] RFC: Contributing Bazel BUILD files in the "peripheral" support tier
Hi Geoffrey, Thanks for the re-submission. I have some comments below that may sound negative, but they're probably just a reflection of my own ignorance. I want to make sure the submission is clear, so it can be accepted on its own right. On Tue, 17 Nov 2020 at 03:02, Geoffrey Martin-Noble via llvm-dev < llvm-dev at lists.llvm.org> wrote:> This should not affect development of core tier components. One reason we > propose adding this to the root utils/ directory instead of under > llvm/utils (where GN is located) is to avoid unnecessarily sending messages > to llvm-commits. Others have raised the concern that the existence of an > alternative build system might lead to lack of maintenance for the CMake > build system. Given that supporting CMake will remain a requirement and > maintenance of a Bazel build system will continue to happen regardless, we > do not expect any significant impact in this way. >I was under the impression that "utils" was actually "llvm/utils", which would be in the same place as GN. I don't think we should treat GN and Bazel as different and I really wouldn't like to have a different quality control (for post commit reviews). If the Bazel commits are too verbose (for example, committing auto-generated code), then we should really clean that up and commit the script that generates them and make that part of the build. I understand the need to move the noise away, but move it too far away and it's no better than in a separate repo. A number of people raised the question of "why not a separate repository".> This is indeed possible: It's what we've done with > https://github.com/google/llvm-bazel, which is currently used by > https://github.com/google/iree. It is significantly more infrastructure, > coordination, and complexity for something that is specifically a > configuration for the LLVM project itself, not its own dependent or > adjacent project. >I was also under the impression that one of the big reasons why we needed it to be in LLVM is that, like CMake, it needed files all over the place. This would indeed be a major infrastructure undertaking. But given that it's all being hosted in a single directory, and outside of the LLVM tree, I really can't see what's so much harder about an extra checkout in the same tree. I believe this contribution will significantly improve the situation for> downstream users that use Bazel while having minimal impact on the > community at large. >It's not clear to me yet if LLVM/Bazel is only used in Google projects or any other non-Google project. All that you listed so far seem to be exclusive to Google. This is not a problem per se, but it does promote the idea that Google could common it up internally instead. The main reasons why it would be upstream are that it's either a product by or requirement to the project itself, or it helps unite cross-industry collaboration that wouldn't be possible otherwise. It's clearly not the former (and why it's in the periphery tier), but it's also not clear it's in the latter either. cheers, --renato -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201117/614bfa33/attachment.html>
Renato Golin via llvm-dev
2020-Nov-17 10:52 UTC
[llvm-dev] RFC: Contributing Bazel BUILD files in the "peripheral" support tier
On Tue, 17 Nov 2020 at 05:41, Tom Stellard via llvm-dev < llvm-dev at lists.llvm.org> wrote:> * I still worry about the bazel files causing merging conflicts when > backported to the stable branch. If these are added to tree, could we > have a rule where commits to utils/bazel/ cannot include changes to > other files? >That's usually the policy for other non-code changes, but I agree that with build system components, this is specially important. Expanding on this last point a little bit, this raises some larger>questions about what code should be allowed in tree. Essentially what> we have here is that a critical part of the LLVM project has been > re-implemented and is now being asked to be included in tree alongside > the original implementation (CMake). There are parts of the codebase > where this would clearly not be OK (e.g. a re-implementation of one of > the backends), but for build systems I think you can make a valid case > to either have it or not to have it. >We have examples of competing front-ends (the three different "flang" projects), middle-end infrastructure (the new pass manager), back-ends (the two arm64 targets) and build systems (automake/CMake). But in all examples above, the sub-communities involved agreed on replacing the flang implementation (twice), concurrently developing the new pass manager and merging the two arm64 backends. We also have elected to keep CMake over automake. Those efforts were always with the intention to replace and not to co-exist. So none of our priors are close enough. However, those are all things that we have considered to be "core tier". We should now consider a "peripheral tier" that would be ok with co-existence, as long as they follow the support policy. For your specific point about stable releases, breaking them would be a major failure to follow the policy, so we need to make sure that doesn't happen. And this is why I think it should be a pitch. In my opinion, these> kinds of higher-level decisions are better made by review managers than > by people on the mailing list. >There should be an 100% overlap between those two groups. :) And their views on this subject would be invaluable to the discussion. It'd be beneficial to all if they could chime in. The other nice thing about a pitch is that we don't need to spend days> arguing about this on the mailing list. You can take my feedback, think > about it, and if you think there is some validity to what I have said, > then all you need to do is update your proposal to address my concerns. > And if not, then you can just move on to the next email. >That is true. It would be nice to have a document that reflects the latest updates on the discussion. cheers, --renato -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201117/6101f457/attachment.html>
Chris Tetreault via llvm-dev
2020-Nov-17 17:29 UTC
[llvm-dev] RFC: Contributing Bazel BUILD files in the "peripheral" support tier
Since this is being added within the bounds of a now-existing policy, I will withdraw my objections. Thanks for tabling discussion of adding Bazel until the policy on peripheral tier components was settled, and thanks very much to Renato for taking the initiative to push through a new policy! Unfortunately, I do not know enough about Bazel to provide any sort of useful code review. Thanks, Christopher Tetreault From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Geoffrey Martin-Noble via llvm-dev Sent: Monday, November 16, 2020 7:02 PM To: LLVM Dev <llvm-dev at lists.llvm.org> Subject: [EXT] [llvm-dev] RFC: Contributing Bazel BUILD files in the "peripheral" support tier I previously<https://groups.google.com/g/llvm-dev/c/u07o3QREVUg/%20> proposed contributing Bazel build files to the LLVM monorepo, supported *only* by interested community members and not to interfere with or affect the existing CMake configuration. As part of that conversation, it became clear that the LLVM policies for more "peripheral" components were not clearly documented. We now have a shiny new Support Policy<http://llvm.org/docs/SupportPolicy.html>. Thank you, Renato for moving that forward. Please take a look at it, if you haven't already. I would now like to re-raise the proposal to contribute Bazel build files to the LLVM monorepo. I am starting a new thread, as the last one became rather fragmented. This build configuration would be added at the peripheral support level<http://llvm.org/docs/SupportPolicy.html#peripheral-tier> to a new `utils/bazel` directory. I've prepared a patch<https://reviews.llvm.org/D90352> of what I am proposing to add. It includes a README indicating the level of support. It is largely a port of the Bazel build files Google uses internally and has maintained for several years. This should have approximately the same impact on the community as the current GN build in `llvm/utils/gn` does today. That is, it should not affect anyone who doesn't care. I've commented on the specific requirements<http://llvm.org/docs/SupportPolicy.html#id2> listed in the support policy inline: Code in this tier must: * Have a clear benefit for residing in the main repository, catering to an active sub-community (upstream or downstream). A number of projects build LLVM with Bazel (e.g. IREE<https://github.com/google/iree>, TensorFlow<https://github.com/tensorflow/tensorflow>, PlaidML<https://github.com/plaidml/plaidml/blob/master/vendor/llvm/llvm.BUILD>). Google also uses Bazel to build in its internal source repository. This includes a substantial number of developers and active contributors to LLVM. Adding this to the monorepo would provide a natural coordination point for these projects and avoid fragmentation (projects currently have their own copies of the BUILD files) or Google-centric governance (e.g. signing Google's CLA). * Be actively maintained by such sub-community and have its problems addressed in a timely manner. We can commit to maintaining and addressing issues with the configuration. Google has maintained its internal version of this configuration for a few years. Code in this tier must not: * Break or invalidate core tier code or infrastructure. If that happens accidentally, reverting functionality and working on the issues offline is the only acceptable course of action. There should be no interaction between the Bazel build configuration and any core code or infrastructure. * Negatively affect development of core tier code, with the sub-community involved responsible for making changes to address specific concerns. This should not affect development of core tier components. One reason we propose adding this to the root utils/ directory instead of under llvm/utils (where GN is located) is to avoid unnecessarily sending messages to llvm-commits. Others have raised the concern that the existence of an alternative build system might lead to lack of maintenance for the CMake build system. Given that supporting CMake will remain a requirement and maintenance of a Bazel build system will continue to happen regardless, we do not expect any significant impact in this way. * Negatively affect other peripheral tier code, with the sub-communities involved tasked to resolve the issues, still making sure the solution doesn’t break or invalidate the core tier. Similarly, this should have no interaction with other components in the peripheral tier. We will address conflicts if they arise. * Impose sub-optimal implementation strategies on core tier components as a result of idiosyncrasies in the peripheral component. We do not expect any negative constraints on normal development of core tiers. Bazel is stricter about layering, which may help quickly identify layering issues in incoming commits. * Have build infrastructure that spams all developers about their breakages. Build infrastructure will be configured to only notify opted-in developers. * Fall into disrepair. This is a reflection of lack of an active sub-community and will result in removal. Build bots with accompanying status badges will be prominently linked from the README. Currently a Linux/Clang build bot exists and can be easily reconfigured after the code move. A windows build bot will be added soon. Code in this tier should: * Have infrastructure to test, whenever meaningful, with either no warnings or notification contained within the sub-community. * Have support and testing that scales with the complexity and resilience of the component, with the bar for simple and gracefully-degrading components (such as editor bindings) much lower than for complex components that must remain fresh with HEAD (such as experimental back-ends or alternative build systems). Build bot coverage already exists and will be expanded as described above. * Have a document making clear the status of implementation, level of support available, who the sub-community is and, if applicable, roadmap for inclusion into the core tier. The patch includes a README that should make the support level clear. I am happy to add additional language or take additional steps to make that more clear (e.g. adding `unsupported` to the directory path). * Be restricted to a specific directory or have a consistent pattern (ex. unique file suffix), making it easy to remove when necessary. All configuration is restricted to a single directory and should be trivial to remove. A number of people raised the question of "why not a separate repository". This is indeed possible: It's what we've done with https://github.com/google/llvm-bazel, which is currently used by https://github.com/google/iree. It is significantly more infrastructure, coordination, and complexity for something that is specifically a configuration for the LLVM project itself, not its own dependent or adjacent project. I believe this contribution will significantly improve the situation for downstream users that use Bazel while having minimal impact on the community at large. Thanks, Geoffrey -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20201117/b070626a/attachment-0001.html>