thr3ads.net - llvm dev - [llvm-dev] [RFC] One or many git repositories? [Jul 2016]

Robinson, Paul via llvm-dev

2016-Jul-21 23:39 UTC

[llvm-dev] [RFC] One or many git repositories?

> -----Original Message-----
> From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
> Sent: Thursday, July 21, 2016 3:16 PM
> To: Robinson, Paul
> Cc: Renato Golin; Justin Lebar; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
> 
> 
> > On Jul 21, 2016, at 2:33 PM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> >> On 21 July 2016 at 18:12, Justin Lebar <jlebar at google.com> wrote:
> >>>  llvm, clang, clang-tools-extra, lld, polly, lldb, llgo, compiler-rt,
> >>> openmp, and parallel-libs.
> >>
> >> I really, *really* would like to see libc++ / abi / unwind. :)
> >>
> >> My reason is that, when building toolchains, the C++ ABI and unwinding
> >> are fundamental parts of the run-time library, of which RT is only
> >> part of.
> >
> > When building *your* toolchain...
> >
> > My toolchain uses clang but not libc++/abi/unwind, we have our own, and
> > we don't currently include them in our tree.  We do include compiler-rt.
> >
> > If we should change our minds later we can opt-in to anything else we
> > want (libcxx etc, lld? lldb? who knows) but in the meantime they are
> > unnecessary baggage for my purposes.
>
> As a developer, you can checkout part of the repo with sparse-checkout.

I'm not clear why imposing this cost on everybody who wants less-than-all
(which I'd think would be most people) is superior to the submodule thing
which can be maintained centrally by people who actually understand how to
do it.

> As a downstream integrator, you can filter out the repo history as you
> want before merging into your repo.

Hmmm maybe, maybe not.  It sounds like the claim is: you can do a sparse
checkout of upstream, then merge it to a different branch, and get only
the history of the stuff that was sparsely checked out.  Does this work
with subtree merges?  Our branches are not rooted at the 'llvm' directory,
and I am suspicious about what the sparse checkout config would do to the
local branch.  (I know, I should do the experiment myself, but right now
I'm in the middle of a release-prep circus and really shouldn't be
spending the time to write this email:-).)

If all of this magic *does* work, then mainly it's a matter of scripting
the sparse-checkout config and deploying that internally.  Not free, but
maybe not horrible either.
--paulr
> 
> —
> Mehdi
>
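
The "scripting the sparse-checkout config" step mentioned above could look roughly like the sketch below. It assumes the unified repository keeps each project in a top-level directory named as in Justin's list (llvm/, clang/, compiler-rt/, ...), which the thread does not confirm. The function name `sparse_checkout_patterns` is hypothetical; it only builds the pattern text and never touches a repository.

```python
def sparse_checkout_patterns(projects):
    """Build sparse-checkout pattern text selecting top-level project dirs.

    Assumption: every project lives in a top-level directory of the
    unified repo whose name equals the project name.
    """
    lines = []
    for name in projects:
        # Reject anything that is not a plain top-level directory name.
        if not name or "/" in name or name in (".", ".."):
            raise ValueError(f"not a top-level project name: {name!r}")
        # A leading slash anchors the pattern at the repo root; the
        # trailing slash restricts it to a directory.
        lines.append(f"/{name}/")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Paul's stated subset: clang (hence llvm) plus compiler-rt.
    print(sparse_checkout_patterns(["llvm", "clang", "compiler-rt"]), end="")
```

With git as of this thread, the text would typically be written to .git/info/sparse-checkout after running `git config core.sparseCheckout true`, followed by `git read-tree -mu HEAD` to update the working tree. Changing the project list later means regenerating the file and rerunning that last command, which is the step Paul worries is easy to get wrong by hand.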

Mehdi Amini via llvm-dev

2016-Jul-21 23:51 UTC

[llvm-dev] [RFC] One or many git repositories?

> On Jul 21, 2016, at 4:39 PM, Robinson, Paul <paul.robinson at sony.com> wrote:
>
>> -----Original Message-----
>> From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
>> Sent: Thursday, July 21, 2016 3:16 PM
>> To: Robinson, Paul
>> Cc: Renato Golin; Justin Lebar; llvm-dev at lists.llvm.org
>> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>>
>>> On Jul 21, 2016, at 2:33 PM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>
>>>> On 21 July 2016 at 18:12, Justin Lebar <jlebar at google.com> wrote:
>>>>> llvm, clang, clang-tools-extra, lld, polly, lldb, llgo, compiler-rt,
>>>>> openmp, and parallel-libs.
>>>>
>>>> I really, *really* would like to see libc++ / abi / unwind. :)
>>>>
>>>> My reason is that, when building toolchains, the C++ ABI and unwinding
>>>> are fundamental parts of the run-time library, of which RT is only
>>>> part of.
>>>
>>> When building *your* toolchain...
>>>
>>> My toolchain uses clang but not libc++/abi/unwind, we have our own, and
>>> we don't currently include them in our tree.  We do include compiler-rt.
>>>
>>> If we should change our minds later we can opt-in to anything else we
>>> want (libcxx etc, lld? lldb? who knows) but in the meantime they are
>>> unnecessary baggage for my purposes.
>> 
>> As a developer, you can checkout part of the repo with sparse-checkout.
>
> I'm not clear why imposing this cost on everybody who wants less-than-all
> (which I'd think would be most people)

Can you please clarify your use of “cost” (bandwidth, disk space, an extra command
to type initially?), otherwise it is hard for me to address your concerns (for
instance, I’m actually sensitive to the one you spelled out clearly in another
email about a commit in lld requiring a rebase in llvm).

> is superior to the submodule thing
> which can be maintained centrally by people who actually understand how to
> do it.

While I see some good principled ways to have a submodule umbrella repo in git, I
don’t see any *without server-side hooks* that has no flaws [*].
Unfortunately this is not addressed by Renato’s proposal: GitHub does not
allow server-side hooks, and another git hosting service is declared
out of discussion in Renato’s proposal.

[*] We may consider the flaws acceptable, but they need to be understood and
accepted, and I don’t think they have been spelled out clearly in Renato’s
proposal.

>> As a downstream integrator, you can filter out the repo history as you
>> want before merging into your repo.
>
> Hmmm maybe, maybe not.  It sounds like the claim is: you can do a sparse
> checkout of upstream, then merge it to a different branch, and get only
> the history of the stuff that was sparsely checked out.

No, that’s not the claim (sparse checkouts are totally unrelated to this part of
my claim).

The claim is to keep the existing history (i.e. no hash changes) that is
currently at http://llvm.org/git/llvm.git and to continue accumulating there any
new commit that touches the llvm subdirectory of the unified repo.
This would be a read-only view of course, but just like it is now.

I.e. if you’re using the existing git repo, we can keep maintaining your
workflow *as-is* forever. It means *no* migration would be forced on any
CI/integration system (other than those relying on SVN).
(We’d need some creativity around the git-svn-id in the commit messages for the
new commits, though.)


— 
Mehdi

> Does this work
> with subtree merges?  Our branches are not rooted at the 'llvm' directory,
> and I am suspicious about what the sparse checkout config would do to the
> local branch.  (I know, I should do the experiment myself, but right now
> I'm in the middle of a release-prep circus and really shouldn't be
> spending the time to write this email:-).)
>
> If all of this magic *does* work, then mainly it's a matter of scripting
> the sparse-checkout config and deploying that internally.  Not free, but
> maybe not horrible either.
> --paulr
> 
>> 
>> —
>> Mehdi

Sanjoy Das via llvm-dev

2016-Jul-22 00:16 UTC

[llvm-dev] [RFC] One or many git repositories?

Hi Mehdi,

I really like your idea of having a few "projected" git repositories
(i.e. capture all commits that touch llvm/ into llvm.git, all that
touch clang/ to clang.git etc.).  I think it should solve our problem
of llvm-forks-with-downstream changes very nicely (I think we won't
have to do anything, as you said).  I still want to sleep on it to see
if I can spot any issues.

@David Chisnall and others with local forks: can you spot any
potential issues with Mehdi's plan?  Are there cases where it won't
work?

-- Sanjoy
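
The "projected" repositories idea can be illustrated with a small model: walk the unified history in order and keep, for each project, only the commits that touch its subdirectory. This is a sketch of the filtering rule only, not of how such mirrors would actually be produced; the `Commit` type, the `project` function and the sample history are hypothetical, and real projected commits would carry their own git hashes rather than the labels used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    ident: str    # label for the unified-repo commit (hypothetical)
    paths: tuple  # paths touched, relative to the unified repo root


def project(history, prefix):
    """Return the per-project view of a unified history.

    Keeps, in original order, every commit that touches at least one
    path under `prefix`, restricted to those paths with the prefix
    stripped so the view is rooted at the project directory.
    """
    view = []
    for commit in history:
        kept = tuple(p[len(prefix):] for p in commit.paths if p.startswith(prefix))
        if kept:  # commits touching nothing under prefix are dropped
            view.append(Commit(commit.ident, kept))
    return view


# Hypothetical unified history: c3 touches two projects at once.
unified = [
    Commit("c1", ("llvm/lib/IR/Value.cpp",)),
    Commit("c2", ("clang/lib/Sema/Sema.cpp",)),
    Commit("c3", ("llvm/include/llvm/ADT/StringRef.h", "lld/ELF/Driver.cpp")),
]

llvm_view = project(unified, "llvm/")
```

In this model a commit such as c3 shows up in both the llvm and lld views, each with only its own files, which is the case downstream forks rooted at a single project would need to check against their merge workflow.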

Robinson, Paul via llvm-dev

2016-Jul-22 01:08 UTC

[llvm-dev] [RFC] One or many git repositories?

> Can you please clarify your use of “cost” (bandwidth, disk space, extra command
> to type initially?),

Developer time, barrier to entry for new contributors.  Getting the
sparse-checkout business right looks like it is actually non-trivial and not
recommended for the git novice.  *Changing* the sparse-checkout configuration
later appears to be fraught with peril (easy to get wrong).

> The claim is to keep the existing history (I.e. not hash changes) that is
> currently at http://llvm.org/git/llvm.git and continue to accumulate there any
> new commit that would touch the llvm subdirectory of the unified repo.
> This would be a read-only view of course, but just like it is now.

Hmmm so there's still a per-old-project view?  Missed that aspect, sorry…
it would let us preserve our processes in terms of integrating the flow from
upstream, although being able to get a correctly linearized flow of commits from
the unified repo would be preferable and we would *want* to change over.  Still
not clear how to make that work with a sparse checkout.
--paulr


Renato Golin via llvm-dev

2016-Jul-22 09:48 UTC

[llvm-dev] [RFC] One or many git repositories?

On 22 July 2016 at 00:51, Mehdi Amini <mehdi.amini at apple.com> wrote:
> While I see some good principled way to have a submodule umbrella repo in
> git, I don’t see any *without  server-side hooks* that does not have any
> flaw*. Unfortunately this is not addressed by Renato’s proposal, and github
> does not allow server-side hooks, and another git hosting service is spelled
> out-of-discussion for Renato’s proposal.

*Please*, stop calling it *my* proposal. It absolutely wasn't.

I'll repeat, as people seem to prefer repeated arguments to
references to past emails, but:

* I have sent a number of concerns and options
* People have favoured GitHub with sub-modules (I hadn't)
* I summarised the first proposal, which seemed to be reaching consensus

Let's call it "First Proposal" or "GitHubSubMod" proposal:

http://llvm.org/docs/Proposals/GitHubSubMod.html

Everything "out-of-discussion" on the first proposal was in the
interest of reaching a self-contained proposal, and had absolutely no
ulterior motive.

Now the proposal is there, best we could make it. If there are
technical flaws, by all means, send a review to that document, but you
can't change that proposal into something else.

You can, however, create a new one, and that's what you're doing.

As people said earlier, getting to know one proposal well has shown
many people that the "consensus" might not have been the best way
forward, but that was only possible by actually finalising at least
one proposal.

My assumption was that a survey would take us to the next step
(finding the precise and impersonal problems with that proposal), but
it seems I didn't need that. I stand corrected.

One thing your proposal doesn't even touch is where the repo will be.
I know it's basically orthogonal, but it's one of the key reasons why
we need to move. I have no preference, as long as the solution is
maintainable and caters for our needs.

My personal opinion is to host somewhere professional unless there's a
good reason not to.

If we use external hosting, GitHub is the best because there are
already thousands of forks there (see Chisnall's email), and
people do come to the list thinking the GitHub repo is our official
one.

If we don't, we'll have to understand the costs and who's going to
maintain it (volunteer vs. hired help). Relying on volunteers (like
myself) is extremely risky and I'd very much rather not go that way.
Relying on any company can create bias (or the impression of bias),
which can divide the community.

Again, I'm not pushing *any* agenda, just laying out the issues. But
if you want to compete with the first proposal, you *have* to have a
complete proposal, with all the pros and cons clearly laid out.

cheers,
--renato

PS: We may need a grid of proposals ({external, local} x {submod,
monolithic})...
