David Chisnall via llvm-dev
2016-Jul-21 09:51 UTC
[llvm-dev] [RFC] One or many git repositories?
On 21 Jul 2016, at 07:12, Renato Golin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> I don't much care which of those is chosen. I have a slight preference for
>> #1, for ease of doing things like grep/log/etc on llvm by itself, excluding
>> all the other projects. But either way seems probably fine, and an
>> improvement over multiple repositories.
>
> I don't have a strong preference, but #1 proponents weakly convinced
> me with two arguments:
>
> 1. it is easier to mix-and-match repositories as you like
>
> I'd still symlink as I do today, but I can see why this would be
> interesting for off-tree users.
>
> 2. it "makes more sense" to let Clang *use* LLVM instead of LLVM *host* Clang
>
> this seems more preference than anything, but people that know CMake
> more than I do said it would be "easier" and I trust them. I have no
> technical arguments pro or against.
>
> Though, I'd be fine with anything really.

First of all, thank you very much for driving this, Renato. It’s a horrible task to do and I’m very grateful that you’ve taken this on.

I would, however, like to add one argument against a single repo model. If you look at the current LLVM GitHub repo, GitHub is tracking 806 forks. It is tracking 595 forks for clang. Not everyone using git for downstream development has a fork on GitHub. In particular, GitHub does not allow private forks of public repos, so anyone who has a non-public git fork of LLVM will have done a git clone and a git push to their own private repo (on or off GitHub). I know of about a dozen such private repos and (for some bizarre reason) most companies don’t tell me about the secret things that they’re doing with LLVM, so there are undoubtedly a lot more that I don’t know about.

Conservatively, I would estimate that we have at least a thousand downstream forks of the current LLVM git repository. Moving to a single repo model will break all of them. It is completely unacceptable to break so many downstream consumers unless we are able to provide them with some coherent migration plan, but I have not seen anyone in the single-repo camp suggest anything.

David
James Y Knight via llvm-dev
2016-Jul-21 15:11 UTC
[llvm-dev] [RFC] One or many git repositories?
That is a good point.

With the multi-repo plan, we were planning to take the existing git repositories that everyone's already using, and has work based on, and make them official. However, with the single-repo plan, we'd be making a brand new git repository, with an integrated/interleaved history. As such, all the commit hashes would be different, and even the directory layout would be different from the current git-svn repositories. And so we would "strand" all existing forks -- they'd be unable to easily pull in new changes to these repositories after the migration.

That we'll be getting incompatible history has been glossed over, and it is indeed really important to make it clear and have a good plan there. This doesn't only affect actual "forks"; it also affects every single developer with a local git clone which contains unfinished work.

Therefore, we must come up with a plan to allow such users to rebase their existing work onto the new repository structure: either documentation describing the git commands people need to run or, if it's really complicated, a script.

I don't think this is a really hard problem though -- I can think of a few ways to help existing users that probably will work (although I'd want to try them first, to ensure they actually do work, of course). The two I'm thinking of are:

1. Just doing "git diff" followed by "git apply --directory=llvm", if you just want to save a patch.

2. Some "git filter-branch" invocation to rename all the files in your existing repo, followed by "git rebase" (or "git merge"), if you have some more history you want to maintain.

To me, it seems eminently worth it to pay a one-time transition cost like that, if it makes life easier afterwards, which I believe the single-repo system would do, as long as it's documented well so that not every developer needs to figure it out on their own.
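
As a rough illustration of the first option, here is a minimal sketch that has only been thought through against throwaway repositories, not the real ones. The repository names "old" and "mono", the new_repo helper and the file lib/f.c are all made up for the example; the only assumption about the proposed layout is that the llvm sources end up under an llvm/ subdirectory.

```shell
#!/bin/sh
# Sketch only: throwaway repositories under a temp directory stand in for an
# existing standalone llvm clone ("old") and the proposed single repo ("mono").
set -eu
work=$(mktemp -d)

# Hypothetical helper: create a repository with a committer identity set.
new_repo() {
    git init -q "$1"
    git -C "$1" config user.email "dev@example.com"
    git -C "$1" config user.name "Example Dev"
}

# "old": today's layout, with sources at the repository root.
new_repo "$work/old"
mkdir -p "$work/old/lib"
printf 'int f(void) { return 1; }\n' > "$work/old/lib/f.c"
git -C "$work/old" add .
git -C "$work/old" commit -q -m "upstream state"

# "mono": the same file, but under an llvm/ subdirectory.
new_repo "$work/mono"
mkdir -p "$work/mono/llvm/lib"
cp "$work/old/lib/f.c" "$work/mono/llvm/lib/f.c"
git -C "$work/mono" add .
git -C "$work/mono" commit -q -m "upstream state, new layout"

# Unfinished, uncommitted local work in the old clone.
printf 'int f(void) { return 2; }\n' > "$work/old/lib/f.c"

# Save it as a patch, then apply it with every path prefixed by llvm/.
git -C "$work/old" diff > "$work/local.patch"
git -C "$work/mono" apply --directory=llvm "$work/local.patch"

result=$(cat "$work/mono/llvm/lib/f.c")
echo "$result"
```

This only carries over a working-tree diff. Committed local history would be flattened into the patch, which is why the second option exists.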
David Chisnall via llvm-dev
2016-Jul-21 15:22 UTC
[llvm-dev] [RFC] One or many git repositories?
On 21 Jul 2016, at 16:11, James Y Knight <jyknight at google.com> wrote:
>
> I don't think this is a really hard problem though -- I can think of a few ways to help existing users that probably will work (although I'd want to try them first, to ensure it actually does work, of course). The two I'm thinking of are just doing "git diff" followed by "git apply --directory=llvm" if you just want to save a patch. Or, some "git filter-branch" invocation to rename all the files in your existing repo, followed by "git rebase" (or "git merge"), if you have some more history you want to maintain.

Our clones of LLVM and clang have a reasonable amount of history (a couple of hundred commits, I believe), including multiple branches, that we’d want to preserve. Both branches have merged from upstream multiple times. It’s one of the smaller friendly forks that I know about. I’ve not used git filter-branch before, but I’d be very impressed if there is some simple invocation that can move from this model.

I was in favour of the GitHub migration primarily because a lot of downstream LLVM users already have a workflow based around GitHub that works well, and the proposal was to make this closer to the official workflow. I’m very nervous about a last-minute change to require everyone downstream to restructure their workflows. In particular, the fact that we have a third more public GitHub forks of LLVM than of clang, and eight times as many as of lldb, implies to me that forcing everyone downstream to pull in all subprojects would not be particularly well received.

David
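
For reference, the renaming half of the "git filter-branch" suggestion quoted above might look roughly like the sketch below. It runs against a throwaway repository with a linear two-commit history; the repository name "fork", the file lib/f.c and the temporary directory name .moved are made up for the example. It does not attempt the subsequent "git rebase" or "git merge", and it says nothing about how well this copes with branches that have merged from upstream several times.

```shell
#!/bin/sh
# Sketch only: a throwaway repository stands in for a downstream llvm fork.
set -eu
# Recent git versions print a warning and pause before filter-branch runs;
# setting this variable skips that.
export FILTER_BRANCH_SQUELCH_WARNING=1
work=$(mktemp -d)

git init -q "$work/fork"
cd "$work/fork"
git config user.email "dev@example.com"
git config user.name "Example Dev"

# Two local commits with sources at the repository root.
mkdir lib
printf 'one\n' > lib/f.c
git add .
git commit -q -m "first local commit"
printf 'two\n' >> lib/f.c
git commit -q -a -m "second local commit"

# Rewrite every commit reachable from HEAD so that the tree lives under
# llvm/. The filter runs in a temporary checkout of each commit. Top-level
# dotfiles are not matched by "*" and would stay where they are.
git filter-branch -f --tree-filter '
    mkdir .moved &&
    mv * .moved/ &&
    mv .moved llvm
' HEAD > /dev/null

files=$(git ls-files)
commits=$(git rev-list --count HEAD)
echo "$files ($commits commits)"
```

The rewritten commits get new hashes, so every branch that should survive has to be named in place of HEAD (or rewritten with "-- --all"), and anything already pushed elsewhere would need to be re-pushed.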