thr3ads.net - llvm dev - [llvm-dev] [RFC] One or many git repositories? [Jul 2016]

Sanjoy Das via llvm-dev

2016-Jul-22 00:16 UTC

[llvm-dev] [RFC] One or many git repositories?

Hi Mehdi,

I really like your idea of having a few "projected" git repositories
(i.e. capture all commits that touch llvm/ into llvm.git, all that
touch clang/ to clang.git etc.).  I think it should solve our problem
of llvm-forks-with-downstream changes very nicely (I think we won't
have to do anything, as you said).  I still want to sleep on it to see
if I can spot any issues.
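
As a rough illustration of the "projected" repositories idea, here is a minimal sketch that models it in plain Python rather than with real git objects. The `Commit` class, the `project` function, the commit ids and the example paths are all hypothetical and chosen only for illustration. It assumes each monorepo commit is projected into every top-level directory it touches, with the commit message carried over unchanged.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    ident: str      # hypothetical commit id, not a real git hash
    message: str
    paths: tuple    # files touched, relative to the monorepo root


def project(commits):
    """Split a monorepo history into one history per top-level directory."""
    projected = defaultdict(list)
    for commit in commits:
        by_project = defaultdict(list)
        for path in commit.paths:
            top, _, rest = path.partition("/")
            by_project[top].append(rest)
        for name, files in by_project.items():
            # The message is carried over unchanged; only the file list shrinks.
            projected[name].append(Commit(commit.ident, commit.message, tuple(files)))
    return dict(projected)


# Illustrative history: the second commit touches both llvm/ and clang/.
history = [
    Commit("m1", "Add a new IR intrinsic", ("llvm/lib/IR/Function.cpp",)),
    Commit("m2", "Use the intrinsic in codegen",
           ("llvm/include/llvm/IR/Intrinsics.td", "clang/lib/CodeGen/CGBuiltin.cpp")),
    Commit("m3", "Fix a parser warning", ("clang/lib/Parse/Parser.cpp",)),
]
views = project(history)
```

In this model `views["llvm"]` holds the projections of `m1` and `m2`, and `views["clang"]` holds those of `m2` and `m3`, so a downstream fork that tracks only one project would see only the commits that touch it.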

@David Chisnall and others with local forks: can you spot any
potential issues with Mehdi's plan?  Are there cases where it won't
work?

-- Sanjoy

Justin Lebar via llvm-dev

2016-Jul-22 00:31 UTC

FYI jyknight and I have a hacky script that sort of works for
translating commits to the existing monorepo.  I'm working on cleaning
it up and applying it to David Chisnall's branches.  Hopefully I'll
have something by eod tomorrow.  (This isn't to take a position on
using the existing monorepo as our new source of truth, nor to take a
position on any particular directory layout.)

I wanted to try to merge David's llvm and clang branches into a single
branch -- that would be a big usability improvement over the current
situation.  But there isn't enough information in the repositories to
recover the correct interleaving.  You could try to order by date, but
that only works so long as the history is linear...  So I gave up on
that feature.
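
To make the ordering problem concrete, here is a minimal sketch of date-based interleaving, assuming each branch is a strictly linear history already sorted oldest first. The tuples, ids and timestamps are made up for illustration, and this is not the script mentioned above. Once a branch contains merges, commit dates no longer determine a single order, and this sketch does not attempt to handle that case.

```python
import heapq
from datetime import datetime, timezone


def interleave_by_date(*branches):
    """Merge linear histories, each already sorted oldest-first, by commit date."""
    return list(heapq.merge(*branches, key=lambda commit: commit[0]))


def ts(day, hour):
    # Helper for building illustrative UTC timestamps in July 2016.
    return datetime(2016, 7, day, hour, tzinfo=timezone.utc)


# Each entry is (commit date, project, hypothetical commit id).
llvm_branch = [(ts(20, 9), "llvm", "a1"), (ts(21, 15), "llvm", "a2")]
clang_branch = [(ts(20, 12), "clang", "b1"), (ts(22, 8), "clang", "b2")]

merged = interleave_by_date(llvm_branch, clang_branch)
order = [ident for _, _, ident in merged]
```

Here `order` comes out as `a1`, `b1`, `a2`, `b2`, which is only meaningful because both inputs are linear.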

I also kind of like the idea of these projected repositories, and if
that's sufficient, awesome, save us some work.


Simon Taylor via llvm-dev

2016-Jul-22 08:16 UTC

Hi all,

I'll start by saying I've skimmed this thread and am not actually a user of LLVM
at all, but had some git thoughts that might be worth contributing.

> On 22 Jul 2016, at 01:16, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> @David Chisnall and others with local forks: can you spot any
> potential issues with Mehdi's plan?  Are there cases where it won't
> work?

One potential "issue" is that a single commit into the monolithic repository would
potentially touch multiple subprojects (that's one of the advantages). Projecting
that into individual repositories would only commit changes to those files, but
the commit message would be maintained and might therefore be confusing in the
context of the individual repository, especially if only a small part of the
commit affects that individual sub-repo.

Essentially if the projects are "supposed" to be separate modules, then submodules
are the solution to enforce that independence, ensuring commits in each module
only affect that module and have appropriate commit messages for that context.

If the submodules are in practice more intertwined than that, then it does feel
like an ideologically pure solution that in the end just gets in the way of
developer productivity.

I've got a setup here that uses a hierarchy of submodules, so there is a "combined"
submodule that just ensures that its children (other submodules) are at mutually
compatible versions. That helped productivity (multiple consumers of the
"combined" submodule don't need to manually track versions of all the children) but
this discussion is pushing me towards the thought that actually a monorepo would
be a more productive solution anyway, and make more sense for cross-cutting
changes.

And sorry to throw another option into the ring; and one that might already have
been discussed and discounted, but thought it worth sharing.

1) Create a new llvm-project-mono repo
2) Use git subtree instead of git submodule to add all the directories to match
the layout of llvm-project.
3) From now on, all commits go to the monorepo
4) monorepo commits can be projected to the individual project repos, and
additionally a new commit on llvm-project can be made with the submodule version
updates
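
Step 4 can be sketched roughly as follows. This is a simplified model in plain Python, not a real server-side hook: the function `apply_mono_commit`, the `name@id` scheme for projected commit ids, and the example paths are all assumptions made for illustration. It takes one monorepo commit, advances the head of every subproject that commit touches, and produces the message for the single "submodule update" commit on the umbrella repo.

```python
def apply_mono_commit(heads, mono_id, touched_paths):
    """Return (new_heads, umbrella_message) for one monorepo commit.

    heads maps a subproject name to the id of its latest projected commit.
    """
    affected = sorted({path.split("/", 1)[0] for path in touched_paths})
    new_heads = dict(heads)
    for name in affected:
        # Hypothetical id scheme: real projected commits would get git hashes.
        new_heads[name] = f"{name}@{mono_id}"
    message = f"Update submodules for {mono_id}: " + ", ".join(affected)
    return new_heads, message


# Illustrative starting state with three subprojects.
heads = {"llvm": "llvm@m0", "clang": "clang@m0", "lld": "lld@m0"}
heads, message = apply_mono_commit(
    heads, "m1", ["llvm/lib/IR/Value.cpp", "clang/lib/Sema/Sema.cpp"]
)
```

After the call, the `llvm` and `clang` heads point at their new projected commits while `lld` is left where it was, and `message` names the two affected submodules.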

Advantages:

- No change for existing downstream users unless they want to move to the mono
view
- Easier developer experience for cross-cutting changes
- Git log by path would work identically on either view of the repository
- Hashes from before the creation of the mono repo would match in both views -
the mono repo will have multiple roots but that's not unusual with git subtree

Disadvantages:

- Step 4 from my list would need a script to keep things updated. A server-side
hook would be best. The mapping is deterministic (every mono repo commit will
map to one commit in any affected submodules and one "submodule update" commit in
the umbrella llvm-project repo), so if the server responsible falls over the
updates might be delayed but can be caught up without losing anything.
- Less ideologically pure in terms of trying to keep the modules independent
- Commit hashes will diverge between the two views from the creation of the mono
repo, making comparisons / merges between clones of the different views more
difficult
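
The catch-up property in the first disadvantage can be illustrated with a small model, assuming a projected commit's id depends only on its parent id, the monorepo commit it came from, and the project name. The `projected_id` scheme below is hypothetical (real git hashes cover more fields), as are the `catch_up` function and the shape of its state dictionary.

```python
import hashlib


def projected_id(parent_id, mono_id, project):
    """Derive a projected commit id from its inputs only (hypothetical scheme)."""
    data = f"{parent_id}|{mono_id}|{project}".encode()
    return hashlib.sha1(data).hexdigest()[:12]


def catch_up(state, mono_log):
    """Project every monorepo commit not yet processed; safe to re-run."""
    heads = dict(state["heads"])
    done = state["done"]
    for mono_id, projects in mono_log[done:]:
        for project in sorted(projects):
            heads[project] = projected_id(heads.get(project, ""), mono_id, project)
    return {"heads": heads, "done": len(mono_log)}


# Each entry is (monorepo commit id, subprojects it touches).
mono_log = [("m1", ["llvm"]), ("m2", ["llvm", "clang"]), ("m3", ["clang"])]
empty = {"heads": {}, "done": 0}

# A server that keeps up with every push...
step = catch_up(empty, mono_log[:1])
step = catch_up(step, mono_log[:2])
step = catch_up(step, mono_log)
# ...versus a server that was down and processes the backlog in one go.
batch = catch_up(empty, mono_log)
```

Under these assumptions `step` and `batch` end in the same state, and running `catch_up` again on an up-to-date state changes nothing, which is the sense in which delayed updates can be caught up without losing anything.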

Simon

Sean Silva via llvm-dev

2016-Jul-22 09:03 UTC

On Fri, Jul 22, 2016 at 1:16 AM, Simon Taylor via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Hi all,
>
> I’ll start by saying I’ve skimmed this thread and am not actually a user
> of LLVM at all, but had some git thoughts that might be worth contributing.
>
> > On 22 Jul 2016, at 01:16, Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >
> > @David Chisnall and others with local forks: can you spot any
> > potential issues with Mehdi's plan?  Are there cases where it won't
> > work?
>
> One potential “issue” is that a single commit into the monolithic
> repository would potentially touch multiple subprojects (that’s one of the
> advantages). Projecting that into individual repositories would only commit
> changes to those files, but the commit message would be maintained and
> might therefore be confusing in the context of the individual repository,
> especially if only a small part of the commit affects that individual
> sub-repo.
>

What do we do now? We already have the ability to do this. See the thread
"[LLVMdev] [Git-fu] How to commit inter-repositories atomically to svn".

-- Sean Silva


Mehdi Amini via llvm-dev

2016-Jul-22 23:35 UTC

> On Jul 22, 2016, at 1:16 AM, Simon Taylor via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> 1) Create a new llvm-project-mono repo
> 2) Use git subtree instead of git submodule to add all the directories to
> match the layout of llvm-project.
> 3) From now on, all commits go to the monorepo
> 4) monorepo commits can be projected to the individual project repos, and
> additionally a new commit on llvm-project can be made with the submodule version
> updates

This is what I proposed, except I'm not using subtree but an explicit "move" commit
in the existing repos before merging them:
https://github.com/joker-eph/llvm-unified

The reason I didn't go with subtree merging is that it breaks `git log --follow
path/to/file`. I suspect not many tools (e.g. blame history in a text editor)
support the subtree metadata in the merge commit.

Do you see any drawback to what I did instead?

 
Mehdi


