Robinson, Paul via llvm-dev
2016-Jul-31 05:38 UTC
[llvm-dev] [RFC] One or many git repositories?
> The only thing a monorepo gets you that strictly isn’t possible without
> it is the ability to commit to multiple projects in a single commit.
> Personally I don’t think that is a big enough justification, but that is
> my opinion, not a fact.

Okay, I just bumped into r277008, in which commits to llvm, clang, and clang-tools-extra all have the same SVN revision number. I don't know how it happened but it did. Is this just an artifact of how somebody pasted together a bunch of git-svn projects, or is it something that a top-level git repo with submodules would allow? And if it is, then the "only thing a monorepo gets you" isn't something that you need a monorepo to get.

Your befuddled correspondent,
--paulr
Justin Lebar via llvm-dev
2016-Jul-31 07:06 UTC
[llvm-dev] [RFC] One or many git repositories?
> And if it is, then the "only thing a monorepo gets you" isn't something
> that you need a monorepo to get.

This is an *extremely important* point to understand, so let me try to be really clear about the current state of the world and the state of the world under the two "move to git" proposals.

Today, all commits ultimately end up in SVN. Our SVN is effectively a monorepo, so today, a single commit can touch multiple subprojects. How you get the commit into SVN is your business. Maybe you can hack git-svn somehow to do the atomic commit. (If this is possible, it's beyond my ken.) Alternatively you can just commit via SVN. If you're a git user, I wrote a hacky script [1] that cherry-picks commits from the existing monorepo mirror and commits them via SVN. It's annoying to do, but it is possible today to atomically commit to multiple subprojects, as you observed.

Under the monorepo proposal, this becomes much easier. It's just "git commit", no magic.

Under the multirepo git proposal, this becomes either impossible or much more complicated. Under that proposal, we have separate git repositories for each subproject, and we push directly to these. There's then an umbrella repository, which includes the subproject repos as git submodules. There's a script which periodically checks the subproject repos for updates. When it sees an update, it creates a new commit in the umbrella repository. The script is the only thing that can create commits in the umbrella repo.

In order to get atomic commits in the multirepo world, we would need some way to inform the script that two otherwise separate commits should appear in the umbrella repo as a single commit. We'd probably need to agree on a protocol communicated via commit messages. We'd also probably need client-side scripts to set the commit messages appropriately.

I expect this would be so much of a hassle that, even if we managed to implement it on the server side, it would be prohibitively complex for most users.

In addition, under the multirepo proposal, you only get synchronized subproject commits in your local checkout if you choose to use a git-submodules based workflow. If you use the workflow that we currently have, then on the client side, there is no guarantee that your subprojects will be sync'ed. (This is the same as most people's client-side git workflows today.) *Even if we manage to atomically commit across subprojects*, that is of limited utility unless those commits show up atomically on developers' workstations. But using a workflow based on git-submodules is highly complex as compared to the monorepo -- this was what I was trying to illustrate in my very first email on this thread.

When we say "the monorepo gets you atomic commits," that's an abbreviation for

1) The monorepo makes it far simpler to make atomic commits from git as compared to the current SVN setup.
2) Atomic commits are definitely possible in the monorepo. They are theoretically possible in the multirepo, with extensive tooling etc.
3) Under the basic monorepo workflow, your checkouts are always correct with respect to atomic commits. Under the basic multirepo workflow, this is not true -- you have to engage with git submodules to get this property, and that is a giant pain.

Sorry for the wall of text, but this is important.

[1] https://github.com/jlebar/llvm-repo-tools. Be careful, I've only made one commit with it so far. :)
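
To make the multirepo bookkeeping described above a little more concrete, here is a minimal sketch of the grouping step an umbrella-updating script might need. Everything named here is an assumption made for illustration: the `Atomic-Group:` commit-message trailer, the `Commit` record, and `group_for_umbrella` are hypothetical and are not part of either proposal or of any existing LLVM tooling.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical commit-message trailer; neither proposal defines one.
TRAILER = "Atomic-Group:"


@dataclass(frozen=True)
class Commit:
    repo: str      # subproject name, e.g. "llvm" or "clang"
    sha: str
    message: str


def atomic_group(message: str) -> Optional[str]:
    """Return the group id named by the trailer, or None if there is none."""
    for line in message.splitlines():
        if line.startswith(TRAILER):
            return line[len(TRAILER):].strip() or None
    return None


def group_for_umbrella(commits: List[Commit]) -> List[List[Commit]]:
    """Partition newly seen subproject commits into umbrella commits.

    Commits sharing a group id are batched into one umbrella commit; every
    other commit gets an umbrella commit of its own. Batches are ordered by
    the first appearance of any of their members.
    """
    batches: List[List[Commit]] = []
    by_group: Dict[str, List[Commit]] = {}
    for commit in commits:
        gid = atomic_group(commit.message)
        if gid is None:
            batches.append([commit])
        elif gid in by_group:
            by_group[gid].append(commit)
        else:
            by_group[gid] = [commit]
            batches.append(by_group[gid])
    return batches
```

Even this toy version leaves the hard parts open: the script would have to wait until every member of a group has been pushed, decide what to do when one member never arrives, and rely on each committer writing matching trailers by hand or via a client-side helper.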
Justin Lebar via llvm-dev
2016-Jul-31 07:25 UTC
[llvm-dev] [RFC] One or many git repositories?
By the way, I've been using the existing read-only monorepo [1] for a few days now. The intent is to commit via the script I put together [2], although I haven't committed anything other than a testing commit [3].

All I can say is, *wow* is it nice. I hid everything I don't care about using a sparse checkout [4]. Many of my tools (e.g. ctrl-p [5] [6], ycm [7]) suddenly work better now that there isn't an artificial boundary between my clang and llvm repositories. I can have patch queues that include LLVM commits and clang commits arbitrarily interspersed with one another -- something I didn't realize I wanted until I made the switch and noticed I already had branches I could merge (and something we can't do with Bogner's suggested multirepo workflow).

[1] https://github.com/llvm-project/llvm-project
[2] https://github.com/jlebar/llvm-repo-tools
[3] https://github.com/llvm-project/llvm-project/commit/38a6db646d8f43cd9d7cec6c0533e40946cd162f (which, embarrassingly, has a typo in the commit message)
[4] http://jasonkarns.com/blog/subdirectory-checkouts-with-git-sparse-checkout/
[5] https://github.com/kien/ctrlp.vim
[6] https://github.com/jlebar/ctrlp-py-matcher
[7] https://github.com/Valloric/YouCompleteMe
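
For readers who have not used a sparse checkout, the following is a rough sketch of how the pattern file could be generated. It assumes the subprojects live in top-level directories of the monorepo (such as `llvm` and `clang`) and uses git's pattern-file form of sparse checkout; the helper names `sparse_patterns` and `write_sparse_file` are made up for this example and do not come from the tools linked above.

```python
from pathlib import Path
from typing import Iterable


def sparse_patterns(subprojects: Iterable[str]) -> str:
    """Build sparse-checkout patterns that keep only the named top-level dirs."""
    # A leading "/" anchors the pattern at the repository root and a
    # trailing "/" restricts it to directories.
    lines = [f"/{name.strip('/')}/" for name in subprojects]
    return "\n".join(lines) + "\n"


def write_sparse_file(repo_root: Path, subprojects: Iterable[str]) -> Path:
    """Write the patterns to .git/info/sparse-checkout under repo_root."""
    target = repo_root / ".git" / "info" / "sparse-checkout"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(sparse_patterns(subprojects))
    return target
```

After writing the file, one would typically run `git config core.sparseCheckout true` followed by `git read-tree -mu HEAD` in the checkout so that the working tree is trimmed to the listed directories.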
Mehdi Amini via llvm-dev
2016-Jul-31 18:03 UTC
[llvm-dev] [RFC] One or many git repositories?
> On Jul 30, 2016, at 10:38 PM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> The only thing a monorepo gets you that strictly isn’t possible without
>> it is the ability to commit to multiple projects in a single commit.
>> Personally I don’t think that is a big enough justification, but that is
>> my opinion, not a fact.
>
> Okay, I just bumped into r277008, in which commits to llvm, clang, and
> clang-tools-extra all have the same SVN revision number.
> I don't know how it happened but it did. Is this just an artifact of
> how somebody pasted together a bunch of git-svn projects, or is it
> something that a top-level git repo with submodules would allow?
> And if it is, then the "only thing a monorepo gets you" isn't something
> that you need a monorepo to get.

Nobody claimed that you need a monorepo to be able to get this in the absolute. Technically, with a read-write umbrella repo, one could perform this kind of commit with the submodules proposal.

The difference I see is that we trade (optional) complexity at checkout time vs (not-so-optional) complexity during development (day-to-day) time.

In terms of manipulating the repo to achieve something, you can use all the “simple” git commands to handle the monorepo (`git branch` works uniformly, for example).

-- Mehdi
Robinson, Paul via llvm-dev
2016-Jul-31 20:52 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com]
> Sent: Sunday, July 31, 2016 11:03 AM
> To: Robinson, Paul
> Cc: Chris Bieneman; Justin Lebar; llvm-dev at lists.llvm.org; Bruce Hoult
> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>
> > On Jul 30, 2016, at 10:38 PM, Robinson, Paul via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >
> >> The only thing a monorepo gets you that strictly isn’t possible without
> >> it is the ability to commit to multiple projects in a single commit.
> >> Personally I don’t think that is a big enough justification, but that is
> >> my opinion, not a fact.
> >
> > Okay, I just bumped into r277008, in which commits to llvm, clang, and
> > clang-tools-extra all have the same SVN revision number.
> > I don't know how it happened but it did. Is this just an artifact of
> > how somebody pasted together a bunch of git-svn projects, or is it
> > something that a top-level git repo with submodules would allow?
> > And if it is, then the "only thing a monorepo gets you" isn't something
> > that you need a monorepo to get.
>
> Nobody claimed that you need a monorepo to be able to get this in the
> absolute.

Actually it looks to me like beanz said exactly that.

> Technically, with a read-write umbrella repo, one could perform
> this kind of commit with the submodules proposal.
> The difference I see is that we trade (optional) complexity at checkout
> time vs (not-so-optional) complexity during development (day-to-day)
> time.
> In terms of manipulating the repo to achieve something, you can use all
> the “simple” git commands to handle the monorepo (`git branch` works
> uniformly, for example).

Right, I get that part; the question was whether the claim about "strictly isn't possible" was actually correct. You're saying the claim isn't correct but we don't want to operate a multi-repo that way, which is fine.

--paulr