thr3ads.net - llvm dev - [llvm-dev] [RFC] One or many git repositories? [Jul 2016]


Robinson, Paul via llvm-dev

2016-Jul-22 01:08 UTC

[llvm-dev] [RFC] One or many git repositories?

Can you please clarify your use of “cost” (bandwidth, disk space, extra command
to type initially?),

Developer time, barrier to entry for new contributors.  Getting the
sparse-checkout business right looks like it is actually non-trivial and not
recommended for the git novice.  *Changing* the sparse-checkout configuration
later appears to be fraught with peril (easy to get wrong).

The claim is to keep the existing history (i.e. no hash changes) that is
currently at http://llvm.org/git/llvm.git and continue to accumulate there any
new commit that would touch the llvm subdirectory of the unified repo.
This would be a read-only view of course, but just like it is now.

Hmmm so there's still a per-old-project view?  Missed that aspect, sorry… 
it would let us preserve our processes in terms of integrating the flow from
upstream, although being able to get a correctly linearized flow of commits from
the unified repo would be preferable and we would *want* to change over.  Still
not clear how to make that work with a sparse checkout.
--paulr

From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
Sent: Thursday, July 21, 2016 4:52 PM
To: Robinson, Paul
Cc: Renato Golin; Justin Lebar; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] [RFC] One or many git repositories?


On Jul 21, 2016, at 4:39 PM, Robinson, Paul <paul.robinson at sony.com> wrote:




-----Original Message-----
From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
Sent: Thursday, July 21, 2016 3:16 PM
To: Robinson, Paul
Cc: Renato Golin; Justin Lebar; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] [RFC] One or many git repositories?



On Jul 21, 2016, at 2:33 PM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:



On 21 July 2016 at 18:12, Justin Lebar <jlebar at google.com> wrote:

llvm, clang, clang-tools-extra, lld, polly, lldb, llgo, compiler-rt,
openmp, and parallel-libs.

I really, *really* would like to see libc++ / abi / unwind. :)

My reason is that, when building toolchains, the C++ ABI and unwinding
are fundamental parts of the run-time library, of which RT is only
one part.

When building *your* toolchain...

My toolchain uses clang but not libc++/abi/unwind, we have our own, and
we don't currently include them in our tree.  We do include compiler-rt.

If we should change our minds later we can opt-in to anything else we
want (libcxx etc, lld? lldb? who knows) but in the meantime they are
unnecessary baggage for my purposes.

As a developer, you can check out part of the repo with sparse-checkout.

I'm not clear why imposing this cost on everybody who wants less-than-all
(which I'd think would be most people)

Can you please clarify your use of “cost” (bandwidth, disk space, extra command
to type initially?), otherwise it is hard for me to address your concerns (for
instance I’m actually sensitive to the one you spelled out clearly in another
email about a commit in lld requiring a rebase in llvm).

is superior to the submodule thing
which can be maintained centrally by people who actually understand how to
do it.

While I see some good, principled ways to have a submodule umbrella repo in git,
I don’t see any *without server-side hooks* that does not have any flaw*.
Unfortunately this is not addressed by Renato’s proposal, and github does not
allow server-side hooks, and another git hosting service is explicitly
out of scope for Renato’s proposal.

* we may consider the flaws acceptable, but they need to be understood and
accepted, and I don’t think it has been spelled out clearly in Renato’s
proposal.

As a downstream integrator, you can filter out the repo history as you
want before merging into your repo.

Hmmm maybe, maybe not.  It sounds like the claim is: you can do a sparse
checkout of upstream, then merge it to a different branch, and get only
the history of the stuff that was sparsely checked out.

No, that’s not the claim (sparse checkouts are totally unrelated to this part of
my claim).

The claim is to keep the existing history (i.e. no hash changes) that is
currently at http://llvm.org/git/llvm.git and continue to accumulate there any
new commit that would touch the llvm subdirectory of the unified repo.
This would be a read-only view of course, but just like it is now.

I.e. if you’re using the existing git repo, we can keep maintaining your
workflow *as-is* forever. It means *no* migration would be forced on any
CI/integration system (other than those relying on SVN).
(We’d need some creativity around the git-svn-id in the commit messages for the
new commits though).


—
Mehdi
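
The per-project read-only view described above can be pictured with a toy model. This is only a sketch under stated assumptions: commits are represented as (id, list of touched paths) pairs, the ids and paths are made up for illustration, and nothing here rewrites real git history or addresses how existing hashes would be preserved.

```python
def project_view(commits, project):
    """Return the commits that touch `project`, with paths made relative to it.

    `commits` is a list of (commit_id, [paths]) pairs in unified-repo order.
    Commits that touch nothing under the project directory are dropped,
    which mirrors "accumulate any new commit that would touch the llvm
    subdirectory". This is a toy model, not a history-rewriting tool.
    """
    prefix = project.strip("/") + "/"
    view = []
    for commit_id, paths in commits:
        # Keep only the paths under the project and strip the directory prefix.
        kept = [p[len(prefix):] for p in paths if p.startswith(prefix)]
        if kept:
            view.append((commit_id, kept))
    return view


# Hypothetical unified-repo history: ids and paths are illustrative only.
unified = [
    ("c1", ["llvm/lib/IR/Value.cpp"]),
    ("c2", ["lld/ELF/Driver.cpp"]),
    ("c3", ["llvm/include/llvm/ADT/APInt.h", "clang/lib/Sema/Sema.cpp"]),
]
llvm_only = project_view(unified, "llvm")
```

In this model the lld-only commit never shows up in the llvm view, and the mixed commit appears with only its llvm paths.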



Does this work with subtree merges?  Our branches are not rooted at the
'llvm' directory, and I am suspicious about what the sparse checkout config
would do to the local branch.  (I know, I should do the experiment myself,
but right now I'm in the middle of a release-prep circus and really
shouldn't be spending the time to write this email:-).)

If all of this magic *does* work, then mainly it's a matter of scripting
the sparse-checkout config and deploying that internally.  Not free, but
maybe not horrible either.
--paulr



—
Mehdi


Justin Lebar via llvm-dev

2016-Jul-22 01:14 UTC


> Developer time, barrier to entry for new contributors.  Getting the
> sparse-checkout business right looks like it is actually non-trivial and not
> recommended for the git novice.

It's eminently copy-pastable, and there is no possibility of data loss.

I understand it's not zero cost, but I have trouble seeing how there's
a meaningful comparison between

 - the cost of three copy-pastable commands run once, versus
 - the benefit of simplifying the git commands we all run tens or
   hundreds of times a day.

> *Changing* the sparse-checkout configuration later appears to be fraught
> with peril (easy to get wrong).

If you get it wrong, you don't have the right files in your checkout,
and you get a build error about a missing file...

Here too, I get that there's a nonzero possibility that one could
screw this up and get themselves into trouble, but when I actually do
the cost/benefit analysis, it is very hard for me to see how the costs
are anywhere near the same magnitude as the benefits.
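
For readers who want to see what scripting the sparse-checkout setup might look like, here is a minimal sketch. It assumes the classic procedure of enabling `core.sparseCheckout`, listing patterns in `.git/info/sparse-checkout`, and refreshing the work tree with `git read-tree -mu HEAD`; the thread does not spell out the exact three commands, and the helper names and project list are hypothetical.

```python
import os


def sparse_patterns(projects):
    """Build the contents of .git/info/sparse-checkout for top-level dirs."""
    lines = []
    for name in projects:
        name = name.strip("/")
        if not name or "/" in name:
            raise ValueError(f"expected a top-level directory name, got {name!r}")
        # A leading slash anchors the pattern at the repository root;
        # the trailing slash restricts it to a directory.
        lines.append(f"/{name}/")
    return "\n".join(lines) + "\n"


def write_sparse_config(git_dir, projects):
    """Write the pattern file under `git_dir` (normally '<repo>/.git')."""
    info_dir = os.path.join(git_dir, "info")
    os.makedirs(info_dir, exist_ok=True)
    path = os.path.join(info_dir, "sparse-checkout")
    with open(path, "w") as f:
        f.write(sparse_patterns(projects))
    return path


def git_commands(repo_dir):
    """Commands to run around write_sparse_config: the first before writing
    the pattern file, the second after it to update the work tree."""
    return [
        ["git", "-C", repo_dir, "config", "core.sparseCheckout", "true"],
        ["git", "-C", repo_dir, "read-tree", "-mu", "HEAD"],
    ]
```

Changing the selection later would then amount to rewriting the pattern file with a different project list and rerunning the `read-tree` step.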


Robinson, Paul via llvm-dev

2016-Jul-22 05:48 UTC


> -----Original Message-----
> From: Justin Lebar [mailto:jlebar at google.com]
> Sent: Thursday, July 21, 2016 6:15 PM
> To: Robinson, Paul
> Cc: mehdi.amini at apple.com; Renato Golin; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
> 
> > Developer time, barrier to entry for new contributors.  Getting the
> > sparse-checkout business right looks like it is actually non-trivial and
> > not recommended for the git novice.
>
> It's eminently copy-pastable, and there is no possibility of data loss.
>
> I understand it's not zero cost, but I have trouble seeing how there's
> a meaningful comparison between
>
>  - the cost of three copy-pastable commands run once, versus

once per clone (picky, picky, picky...) but extra steps are always
the ones you forget to do.  Scriptable, so maybe not a big deal.

>  - the benefit of simplifying the git commands we all run tens or
> hundreds of times a day.

Personally I already have a script to deal with updating the entire
tree; adapting to submodules would be a one-time-ever cost and I
never think about it again (and never have to retrain my fingers).

I'll acknowledge that people have different workflows, and there are
advantages to the unified repo beyond what 'checkout' costs.  The
size cost of the extra sources is relatively small.  So to get those
benefits without the unnecessary complexity of sparse checkouts,
I would like it set up so I *don't have to build* all the extra pieces
even if they exist in the source tree.  Build time is iteration time
is lost time when building pieces I don't need or care about.  Ditto
the time taken to run the tests of all those pieces I don't care about.
This should be a configuration-time thing (which again I have scripted
and therefore don't have to retrain my fingers).  If the cmake run
can do that for me, I have no problem with a unified repo that holds
the entire LLVM universe in it.
--paulr
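
A configuration-time selection of this kind could be scripted roughly as follows. This is a sketch, not something proposed in the thread: it assumes the `LLVM_ENABLE_PROJECTS` CMake variable that later LLVM releases provide for unified-tree builds, and the source path, generator, and function name are illustrative.

```python
def cmake_configure_args(source_dir, projects, build_type="Release"):
    """Build a cmake command line that enables only the listed projects.

    `projects` holds project directory names such as "clang" or
    "compiler-rt"; anything not listed stays in the source tree but is
    neither built nor tested.
    """
    args = ["cmake", "-G", "Ninja", source_dir, f"-DCMAKE_BUILD_TYPE={build_type}"]
    if projects:
        # CMake list values are semicolon-separated.
        args.append("-DLLVM_ENABLE_PROJECTS=" + ";".join(projects))
    return args


# Hypothetical layout: the llvm directory inside a unified checkout.
configure = cmake_configure_args("llvm-project/llvm", ["clang", "compiler-rt"])
```

Because the list is passed as an argument vector (for example to `subprocess.run`), the semicolons need no shell quoting.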
