Robinson, Paul via llvm-dev
2016-Jul-22 01:08 UTC
[llvm-dev] [RFC] One or many git repositories?
> Can you please clarify your use of “cost” (bandwidth, disk space, extra command to type initially?),

Developer time, barrier to entry for new contributors. Getting the sparse-checkout business right looks like it is actually non-trivial and not recommended for the git novice. *Changing* the sparse-checkout configuration later appears to be fraught with peril (easy to get wrong).

> The claim is to keep the existing history (i.e. no hash changes) that is currently at http://llvm.org/git/llvm.git and continue to accumulate there any new commit that would touch the llvm subdirectory of the unified repo.
> This would be a read-only view of course, but just like it is now.

Hmmm so there's still a per-old-project view? Missed that aspect, sorry… it would let us preserve our processes in terms of integrating the flow from upstream, although being able to get a correctly linearized flow of commits from the unified repo would be preferable and we would *want* to change over. Still not clear how to make that work with a sparse checkout.

--paulr

From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
Sent: Thursday, July 21, 2016 4:52 PM
To: Robinson, Paul
Cc: Renato Golin; Justin Lebar; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] [RFC] One or many git repositories?

> On Jul 21, 2016, at 4:39 PM, Robinson, Paul <paul.robinson at sony.com> wrote:
>
> > -----Original Message-----
> > From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
> > Sent: Thursday, July 21, 2016 3:16 PM
> > To: Robinson, Paul
> > Cc: Renato Golin; Justin Lebar; llvm-dev at lists.llvm.org
> > Subject: Re: [llvm-dev] [RFC] One or many git repositories?
> >
> > > On Jul 21, 2016, at 2:33 PM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > >
> > > > On 21 July 2016 at 18:12, Justin Lebar <jlebar at google.com> wrote:
> > > > > llvm, clang, clang-tools-extra, lld, polly, lldb, llgo, compiler-rt, openmp, and parallel-libs.
> > > >
> > > > I really, *really* would like to see libc++ / abi / unwind. :)
> > > >
> > > > My reason is that, when building toolchains, the C++ ABI and unwinding are fundamental parts of the run-time library, of which RT is only a part.
> > >
> > > When building *your* toolchain...
> > >
> > > My toolchain uses clang but not libc++/abi/unwind, we have our own, and we don't currently include them in our tree. We do include compiler-rt.
> > >
> > > If we should change our minds later we can opt-in to anything else we want (libcxx etc, lld? lldb? who knows) but in the meantime they are unnecessary baggage for my purposes.
> >
> > As a developer, you can checkout part of the repo with sparse-checkout.
>
> I'm not clear why imposing this cost on everybody who wants less-than-all (which I'd think would be most people)

Can you please clarify your use of “cost” (bandwidth, disk space, extra command to type initially?), otherwise it is hard for me to address your concerns (for instance I’m actually sensitive to the one you spelled out clearly in another email about a commit in lld requiring a rebase in llvm).

> is superior to the submodule thing which can be maintained centrally by people who actually understand how to do it.

While I see some good principled ways to have a submodule umbrella repo in git, I don’t see any *without server-side hooks* that does not have any flaw*. Unfortunately this is not addressed by Renato’s proposal, and github does not allow server-side hooks, and another git hosting service is spelled out-of-discussion for Renato’s proposal.

* we may consider the flaws acceptable, but they need to be understood and accepted, and I don’t think it has been spelled out clearly in Renato’s proposal.

> > As a downstream integrator, you can filter out the repo history as you want before merging into your repo.
>
> Hmmm maybe, maybe not. It sounds like the claim is: you can do a sparse checkout of upstream, then merge it to a different branch, and get only the history of the stuff that was sparsely checked out.

No that’s not the claim (sparse checkouts are totally unrelated to this part of my claim).

The claim is to keep the existing history (i.e. no hash changes) that is currently at http://llvm.org/git/llvm.git and continue to accumulate there any new commit that would touch the llvm subdirectory of the unified repo.
This would be a read-only view of course, but just like it is now.

I.e. if you’re using the existing git repo, we can keep maintaining your workflow *as-is* forever. It means *no* migration would be forced on any CI/integration system (other than those relying on SVN).

(We’d need some creativity around the git-svn-id in the commit messages for the new commits though).

-- Mehdi

> Does this work with subtree merges? Our branches are not rooted at the 'llvm' directory, and I am suspicious about what the sparse checkout config would do to the local branch. (I know, I should do the experiment myself, but right now I'm in the middle of a release-prep circus and really shouldn't be spending the time to write this email:-).)
>
> If all of this magic *does* work, then mainly it's a matter of scripting the sparse-checkout config and deploying that internally. Not free, but maybe not horrible either.
> --paulr

-- Mehdi
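For readers trying to picture the per-project view described in this message: the thread does not say how such a view would be generated, so the following is only a sketch of the selection rule it implies, namely that a commit belongs to the llvm view if it touches anything under the `llvm` subdirectory of the unified repo. The `Commit` type, the `project_view` helper, the short hashes, and the file paths are all made up for illustration; real tooling would also have to rewrite paths and preserve the existing hashes, which this does not attempt.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Commit:
    """Hypothetical stand-in for a commit in the unified repo."""
    sha: str
    paths: Tuple[str, ...]  # files touched, relative to the unified repo root


def project_view(commits: List[Commit], subdir: str) -> List[Commit]:
    """Keep, in order, the commits that touch at least one file under subdir."""
    prefix = subdir.rstrip("/") + "/"
    return [c for c in commits if any(p.startswith(prefix) for p in c.paths)]


# Made-up history: one llvm-only commit, one lld-only commit, one cross-project commit.
history = [
    Commit("a1", ("llvm/lib/IR/Value.cpp",)),
    Commit("b2", ("lld/ELF/Driver.cpp",)),
    Commit("c3", ("clang/lib/Sema/Sema.cpp", "llvm/include/llvm/ADT/StringRef.h")),
]
llvm_only = project_view(history, "llvm")
```

Under this rule a cross-project commit such as `c3` would appear in both the llvm view and the clang view, each presumably carrying only its own half of the change.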
Justin Lebar via llvm-dev
2016-Jul-22 01:14 UTC
[llvm-dev] [RFC] One or many git repositories?
> Developer time, barrier to entry for new contributors. Getting the sparse-checkout business right looks like it is actually non-trivial and not recommended for the git novice.

It's eminently copy-pastable, and there is no possibility of data loss.

I understand it's not zero cost, but I have trouble seeing how there's a meaningful comparison between

- the cost of three copy-pastable commands run once, versus
- the benefit of simplifying the git commands we all run tens or hundreds of times a day.

> *Changing* the sparse-checkout configuration later appears to be fraught with peril (easy to get wrong).

If you get it wrong, you don't have the right files in your checkout, and you get a build error about a missing file...

Here too, I get that there's a nonzero possibility that one could screw this up and get themselves into trouble, but when I actually do the cost/benefit analysis, it is very hard for me to see how the costs are anywhere near the same magnitude as the benefits.
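The thread never spells out the "three copy-pastable commands", so what follows is a guess at what they might look like, wrapped in Python so the steps can be scripted. It assumes the unified repo keeps each project in a top-level directory named after it (`llvm/`, `clang/`, ...) and uses the classic git sparse-checkout mechanism: set `core.sparseCheckout`, list patterns in `.git/info/sparse-checkout`, then run `git read-tree -mu HEAD`. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def sparse_patterns(projects: List[str]) -> List[str]:
    """One pattern per wanted project directory.

    The leading slash anchors the pattern at the repo root and the
    trailing slash limits it to directories.
    """
    return [f"/{name}/" for name in projects]


def enable_sparse_checkout(repo: str, projects: List[str]) -> None:
    """Restrict the working tree of an existing clone to the given projects."""
    root = Path(repo)
    # Step 1: turn the feature on for this clone.
    subprocess.run(["git", "config", "core.sparseCheckout", "true"],
                   cwd=root, check=True)
    # Step 2: record which directories should stay checked out.
    info_dir = root / ".git" / "info"
    info_dir.mkdir(exist_ok=True)
    (info_dir / "sparse-checkout").write_text(
        "\n".join(sparse_patterns(projects)) + "\n")
    # Step 3: make the working tree match the new pattern list.
    subprocess.run(["git", "read-tree", "-mu", "HEAD"], cwd=root, check=True)
```

Changing the set of projects later would mean calling `enable_sparse_checkout` again with a different list, which rewrites the pattern file and repeats step 3.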
Robinson, Paul via llvm-dev
2016-Jul-22 05:48 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: Justin Lebar [mailto:jlebar at google.com]
> Sent: Thursday, July 21, 2016 6:15 PM
> To: Robinson, Paul
> Cc: mehdi.amini at apple.com; Renato Golin; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>
> > Developer time, barrier to entry for new contributors. Getting the sparse-checkout business right looks like it is actually non-trivial and not recommended for the git novice.
>
> It's eminently copy-pastable, and there is no possibility of data loss.
>
> I understand it's not zero cost, but I have trouble seeing how there's a meaningful comparison between
>
> - the cost of three copy-pastable commands run once, versus

once per clone (picky, picky, picky...) but extra steps are always the ones you forget to do. Scriptable, so maybe not a big deal.

> - the benefit of simplifying the git commands we all run tens or hundreds of times a day.

Personally I already have a script to deal with updating the entire tree; adapting to submodules would be a one-time-ever cost and I'd never have to think about it again (and never have to retrain my fingers).

I'll acknowledge that people have different workflows, and there are advantages to the unified repo beyond what 'checkout' costs. The size cost of the extra sources is relatively small.

So to get those benefits without the unnecessary complexity of sparse checkouts, I would like it set up so I *don't have to build* all the extra pieces even if they exist in the source tree. Build time is iteration time is lost time when building pieces I don't need or care about. Ditto the time taken to run the tests of all those pieces I don't care about.

This should be a configuration-time thing (which again I have scripted and therefore don't have to retrain my fingers). If the cmake run can do that for me, I have no problem with a unified repo that holds the entire LLVM universe in it.

--paulr
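As a rough illustration of the configuration-time selection asked for here: LLVM's CMake build has a semicolon-separated `LLVM_ENABLE_PROJECTS` cache variable, and a configure script could build its command line along these lines. Whether that variable matches what was meant in this message is an assumption, as are the helper name, the `llvm` source directory, and the Ninja generator.

```python
from typing import List


def cmake_configure_args(source_dir: str, projects: List[str],
                         generator: str = "Ninja") -> List[str]:
    """Build the argv for a cmake run that enables only the listed projects."""
    return [
        "cmake",
        "-G", generator,
        # CMake lists are semicolon-separated.
        "-DLLVM_ENABLE_PROJECTS=" + ";".join(projects),
        source_dir,
    ]


# Example: configure with only clang enabled alongside llvm itself.
args = cmake_configure_args("llvm", ["clang"])
```

The list can be handed to `subprocess.run(args, check=True)` from a build directory; projects left off the list stay in the source tree but are neither built nor tested.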