Mehdi Amini via llvm-dev
2016-Sep-08 18:37 UTC
[llvm-dev] [RFC] One or many git repositories?
> On Sep 8, 2016, at 11:08 AM, dag at cray.com wrote:
>
> Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> writes:
>
>> First, have you read this document: https://reviews.llvm.org/D24167 ?
>>
>> TLDR: The answer is no: you have to see it as it is today, i.e. a
>> single SVN repo containing all the sub-projects, and “exports” in
>> individual repositories.
>>
>> The same thing after: a single git repo containing all the subprojects
>> side-by-side and the *same* “exports” in individual repositories.
>
> Sorry, I sent my earlier reply today before I intended to.
>
> After going back and reading the proposal again, I think I understand
> the plan. I haven't used the SVN repository for years so I was thinking
> in terms of git, that you'd take the existing git mirrors and combine
> them (via submodule or some other mechanism). I understand now the
> proposal is to take the SVN root and export all of that as one giant git
> repository. Is that correct?

Yes.

> If so, that raises a number of questions for me that aren't directly
> addressed in the document as far as I can see:
>
> 1. How are the individual component git mirrors going to be maintained?

Exactly as they are today.

> If a commit goes to the monorepository, what is going to extract the
> relevant bits and commit them to the individual mirrors? The document
> notes that with a monorepository a single commit can touch multiple
> projects (that's good!) but something has to extract the parts of that
> commit that are relevant to each subproject and then send those parts to
> the subproject repository.

Right, but note that this is already the case today: some people already use SVN to commit to clang and LLVM at the same time, and that single SVN commit results in one commit in the llvm git repo and another commit in the clang repo.

> There are tools to do this and I think
> git-subtree is a good candidate [disclosure: I am the git-subtree
> maintainer] but I'm just curious what's being considered as a solution.

Well, we haven't decided on anything for the official mirrors. It looks like you're in a good position to help design how subtree could help here :)
(I have a fairly good understanding of git, but very limited knowledge of subtree.)

Anyway, I hope we will be able to put scripts in the repo so that anyone downstream can split the repo independently of the official mirrors.

> 2. Is there any consideration for restructuring the directory layout?
>
> The document has this to say about checking out multiple components:
>
>> **Monorepo Proposal**
>>
>> The repository contains natively the source for every sub-projects at the right
>> revision, which makes this straightforward::
>>
>>   git clone https://github.com/llvm/llvm-projects.git llvm
>>   cd llvm
>>   git checkout $REVISION
>>
>> As before, at this point clang, llvm, and libcxx are stored in directories
>> alongside each other.
>
> The problem here is that for the build, clang wants to be in llvm/tools
> and other components want to be in other places.

Not exactly: cmake has magic discovery when clang is in tools, but it is not a requirement. You can do the following (and have been able to for years):

  cmake -DLLVM_EXTERNAL_CLANG_SOURCE_DIR=path

> Should the
> monorepository just be structured to have everything in its correct
> place for building? My inclination is to say "no" because it reduces
> the visibility of the subprojects, but what are the alternatives? There
> are two that come to mind off the top of my head, 1) include symlinks in
> the repository or 2) change the build so all components can live at the
> top level.

I'd expect a cmake shortcut:

  cmake -DLLVM_ENABLE_PROJECTS=clang,libcxx,compiler-rt

> I think it's important to think about these kinds of questions because
> once a repository layout has been settled on, it's hard to change. Yes,
> it is relatively easy to move entire directories to new places in git,
> but that not only would require changes to whatever entity updates the
> subproject repositories, it's potentially a huge social issue, which are
> typically the most difficult problems to address. :)
>
> 3. How are the subproject repositories going to be created/migrated?
>
> The individual subproject repositories will have to be created from
> scratch after the monorepository is created, right? We can't just
> transition the existing git mirrors to the new setup, correct?

It depends: there are trade-offs for each option, and I think we need to gather community input to settle on one.

> A
> subproject repository reboot would involve some not insignificant pain
> for downstream users because their git histories are suddenly invalid.
> They would have to fetch a completely different repository and integrate
> it into whatever they have.

If we "reboot" the official git mirrors, I expect we'd provide scripts for integrating from the new monorepo on top of the existing history.
Ultimately these mirrors are "facilities", but it shouldn't be significantly harder for downstream to integrate directly from the monorepo with a bit of scripting, and I suspect this scripting is likely to be shareable and committed upstream.

> If there is some way to maintain the existing git mirrors and layer new
> monorepository commits on top of the existing history that would be
> fantastic. I believe it is technically possible (I might need to add
> some enhancements to git-subtree :)) but I don't know if anyone has
> explored this. I would love to be told you all have the answers
> already. :)
>
> Bisecting
>
> For the multirepository proposal, the document talks about having the
> git-bisect run script update each submodule during bisection. I suppose
> that will work but the bisection would only report that the failure
> exists at a particular commit in the umbrella repository, implying a
> bunch of different commits, one for each subproject. It wouldn't really
> point to a particular subproject as being the culprit, correct?

Yes, it depends on how frequently the umbrella repository is updated.

> The
> document even hints at this: "it is possible that one commit in the
> umbrella repository includes multiple commits in the sub-projects"
>
> That's what I was getting at with my submodule bisect question. It can
> only bisect to a granularity of "one of these subprojects at their
> respective commits caused the problem." With a true monorepository
> bisect can drill down to the exact commit within a subproject or across
> multiple subprojects if the commit touched multiple subprojects. To me
> this is a giant advantage of a non-submodule-based monorepository, which
> I think is what the monorepository proposal is.
>
> If everything I've written here is generally correct, I think the
> monorepository will work for us, as long as each subproject repository
> is maintained at a granularity of one subproject commit per commit to
> the corresponding directory in the monorepository (i.e. full history is
> maintained).
>
> Thanks for your work on this. This kind of work is crucially important
> but often unrecognized and underappreciated.

Thanks :)

If you have any input on parts of the document that can be made more clear, feel free to chime in in the review.

--
Mehdi
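The bisection granularity point above can be illustrated with a toy model. This is only a sketch under stated assumptions: the history, the location of the bug, and the umbrella update points (`snapshot_ends`) are all invented for illustration, and the search is a plain binary search standing in for what `git bisect` does on a linear history.

```python
from typing import Callable, List, Tuple


def bisect_first_bad(n: int, is_bad: Callable[[int], bool]) -> int:
    """Return the smallest index in [0, n) for which is_bad is true.

    Assumes is_bad is monotone (good..good, bad..bad) and index n - 1 is bad.
    """
    lo, hi = 0, n - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


# Hypothetical linear monorepo history: (subproject, description).
history: List[Tuple[str, str]] = [
    ("llvm", "a"), ("clang", "b"), ("llvm", "c"),
    ("libcxx", "d"), ("clang", "e (introduces the bug)"), ("llvm", "f"),
    ("clang", "g"), ("llvm", "h"),
]
BUG_AT = 4  # index of the offending commit (an assumption of this sketch)

# Monorepo: every commit is a bisection point, so the exact commit is found.
mono_culprit = bisect_first_bad(len(history), lambda i: i >= BUG_AT)

# Umbrella repository: umbrella commit k contains history[:snapshot_ends[k]].
# Here the umbrella is assumed to be updated only every few commits.
snapshot_ends = [3, 6, 8]
bad_snapshot = bisect_first_bad(
    len(snapshot_ends), lambda k: snapshot_ends[k] > BUG_AT
)
start = snapshot_ends[bad_snapshot - 1] if bad_snapshot > 0 else 0
suspects = history[start:snapshot_ends[bad_snapshot]]

print("monorepo bisect points at:", history[mono_culprit])
print("umbrella bisect leaves suspects:", suspects)
```

In this model the monorepo search ends on a single commit, while the umbrella search ends on one umbrella update that spans several subproject commits. The less often the umbrella is updated, the longer that suspect list gets.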
Mehdi Amini <mehdi.amini at apple.com> writes:

>> After going back and reading the proposal again, I think I
>> understand the plan. I haven't used the SVN repository for years
>> so I was thinking in terms of git, that you'd take the existing
>> git mirrors and combine them (via submodule or some other
>> mechanism). I understand now the proposal is to take the SVN root
>> and export all of that as one giant git repository. Is that
>> correct?
>
> Yes

Hooray! I got it!

>> If a commit goes to the monorepository, what is going to extract
>> the relevant bits and commit them to the individual mirrors? The
>> document notes that with a monorepository a single commit can
>> touch multiple projects (that's good!) but something has to
>> extract the parts of that commit that are relevant to each
>> subproject and then send those parts to the subproject repository.
>
> Right, but note that it is already the case today, some people are
> already using SVN to commit to clang and LLVM at the same time

That...is an abomination. :)

>> There are tools to do this and I think
>> git-subtree is a good candidate [disclosure: I am the git-subtree
>> maintainer] but I'm just curious what's being considered as a
>> solution.
>
> Well we haven't decided on anything for the official mirrors. It looks
> like you're in a good position to help designing how subtree could
> help here :)
> (I have a fairly good understanding of git, but very limited knowledge
> of subtree)

For the subtree split process, git-subtree currently uses an arcane (and SLOW!) algorithm that I presume was written before filter-branch was available. I inherited the code so I don't know the full backstory. In any event, it's buggy in some corner cases, so my plan is to transition it to filter-branch. For the most common splits it would then simply be a more user-friendly wrapper around filter-branch. I'm guessing that's all the LLVM ecosystem would need. There are some more intricate cases, but those mostly relate to some enhancements I've made that aren't even public yet.

> Anyway I hope will be able to put scripts in the repo so that anyone
> downstream can split the repo independently of official mirrors.

That would be excellent.

>> The problem here is that for the build, clang wants to be in
>> llvm/tools and other components want to be in other places.
>
> Not exactly: cmake has magic discovery when clang is in tools, but it
> is not a requirement. You can do (for years):
> cmake -DLLVM_EXTERNAL_CLANG_SOURCE_DIR=path

Oh! I didn't know that. That makes certain things I do easier. :)

Probably the clang build documents need to be updated. :)

>> Should the monorepository just be structured to have everything in
>> its correct place for building? My inclination is to say "no"
>> because it reduces the visibility of the subprojects, but what are
>> the alternatives? There are two that come to mind off the top of
>> my head, 1) include symlinks in the repository or 2) change the
>> build so all components can live at the top level.
>
> I'd expect a cmake shortcut
> cmake -DLLVM_ENABLE_PROJECTS=clang,libcxx,compiler-rt

Makes total sense.

>> The individual subproject repositories will have to be created
>> from scratch after the monorepository is created, right? We can't
>> just transition the existing git mirrors to the new setup,
>> correct?
>
> It depends: there are trade-offs for each option and I think we need to
> gather community inputs to settle on one.

Yes. Lots of discussion is needed here.

>> A subproject repository reboot would involve some not
>> insignificant pain for downstream users because their git
>> histories are suddenly invalid. They would have to fetch a
>> completely different repository and integrate it into whatever
>> they have.
>
> If we "reboot" the official git mirrors, I expect
> we'd provide scripts for integrating from the new monorepo on top of
> the existing history.

Interesting. If the existing history can be maintained and built upon, that would relieve a lot of burden on users.

> Ultimately these mirrors are "facilities" but it shouldn't be
> significantly harder for downstream to integrate directly from the
> monorepo with a bit of scripting, and I suspect this scripting is
> likely to be shareable and committed upstream.

I suspect you are right.

>> Bisecting
>>
>> For the multirepository proposal, the document talks about having
>> the git-bisect run script update each submodule during
>> bisection. I suppose that will work but the bisection would only
>> report that the failure exists at a particular commit in the
>> umbrella repository, implying a bunch of different commits, one
>> for each subproject. It wouldn't really point to a particular
>> subproject as being the culprit, correct?
>
> Yes, it depends on the frequency of the update of the umbrella.

I see what you mean. Yes, you are correct.

>> Thanks for your work on this. This kind of work is crucially
>> important but often unrecognized and underappreciated.
>
> Thanks :)
>
> If you have any input on parts of the document that can be made more
> clear, feel free to chime in in the review.

Will do!

-David
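The per-subproject split discussed above (one mirror commit per monorepo commit that touches the corresponding directory) can be sketched as a toy model. It is not how git-subtree or filter-branch work internally; real tooling would be something like `git subtree split --prefix=<dir>` or `git filter-branch --subdirectory-filter <dir>`. The `Commit` type, the sample history, and the function name below are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Commit(NamedTuple):
    message: str
    paths: List[str]  # files touched, relative to the repository root


def split_by_subproject(history: List[Commit]) -> Dict[str, List[Commit]]:
    """Project a monorepo history onto each top-level directory.

    A monorepo commit touching several subprojects yields one commit per
    subproject, each with the directory prefix stripped. Commits that do
    not touch a subproject leave no trace in that subproject's mirror.
    """
    mirrors: Dict[str, List[Commit]] = defaultdict(list)
    for commit in history:
        per_project: Dict[str, List[str]] = defaultdict(list)
        for path in commit.paths:
            project, _, rest = path.partition("/")
            if rest:  # skip files that live directly at the monorepo root
                per_project[project].append(rest)
        for project, paths in per_project.items():
            mirrors[project].append(Commit(commit.message, paths))
    return dict(mirrors)


# Hypothetical monorepo history, including one cross-project commit.
history = [
    Commit("Add pass", ["llvm/lib/Foo.cpp"]),
    Commit("Change API and update clang",
           ["llvm/include/Bar.h", "clang/lib/Baz.cpp"]),
    Commit("Fix test", ["libcxx/test/t.cpp"]),
]
mirrors = split_by_subproject(history)
for project, commits in mirrors.items():
    print(project, [c.message for c in commits])
```

The cross-project commit shows up once in the llvm mirror and once in the clang mirror, which matches the behaviour described for today's SVN commits that touch both projects.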
Renato Golin via llvm-dev
2016-Sep-08 19:49 UTC
[llvm-dev] [RFC] One or many git repositories?
On 8 September 2016 at 19:37, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I'd expect a cmake shortcut cmake
> -DLLVM_ENABLE_PROJECTS=clang,libcxx,compiler-rt

Hey, I like this idea!

In that case, we don't need the directories in any particular location, as CMake would be able to find and link any place *we* want to put them in (in tree, flat out) and pull out their CMake files.

This would also help each project to be built on its own, if they so require, without upsetting the LLVM-canon build style.

cheers,
--renato
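As a rough illustration of the side-by-side layout discussed above, the following sketch builds a cmake command line that points the LLVM build at sibling directories. Only `LLVM_EXTERNAL_CLANG_SOURCE_DIR` appears in this thread; deriving the variable name for other projects from the same pattern is an assumption of this sketch, as are the helper name and the example path. The command is only constructed here, not executed.

```python
from pathlib import PurePosixPath
from typing import List


def cmake_command(monorepo_root: str, projects: List[str]) -> List[str]:
    """Build a cmake invocation for subprojects stored next to llvm/.

    Assumes each project lives in <root>/<name> and that its source
    directory is passed as LLVM_EXTERNAL_<NAME>_SOURCE_DIR (the form the
    thread shows for clang; generalised here as an assumption).
    """
    root = PurePosixPath(monorepo_root)
    args = ["cmake", str(root / "llvm")]
    for name in projects:
        var = "LLVM_EXTERNAL_" + name.upper().replace("-", "_") + "_SOURCE_DIR"
        args.append(f"-D{var}={root / name}")
    return args


# Hypothetical checkout location.
command = cmake_command("/src/monorepo", ["clang"])
print(" ".join(command))
```

The `LLVM_ENABLE_PROJECTS` shortcut proposed in the thread would replace the per-project variables with a single list. The thread writes that list comma-separated; the option as later implemented in LLVM takes a semicolon-separated CMake list.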
Alexander Benikowski via llvm-dev
2016-Sep-09 09:11 UTC
[llvm-dev] [RFC] One or many git repositories?
I'd vote for having each component in a separate repository and using a monorepo with submodules to work with. Since Clang depends on LLVM but not vice versa (if I am not mistaken; I'm new here), I'd prefer to be able to work with just the LLVM repo if desired.