LLVM Community,

> http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-July/041738.html

This was extraordinarily valuable in helping me understand the situation - thank you, David Blaikie, for pointing me to it. A few key snippets:

"Because I optimize for the code reviewer, not the patch submitter." - Chris Lattner

"Forcing transitioning to git makes no sense for a lot of us - for example, we have lots of scripts that depend on svn revision numbers." - Jason Kim

"Let me say this again: We are not fundamentally changing the development policy around LLVM." - Chris Lattner

My interpretations, which I'll assume later in this long email as premises for a recommended action:

* Chris finds code reviewers to be exceptionally rare and the community's most valuable participants. My previous "spork" suggestion would be a decision made by maintainers, not influenced by patch contributors, and would only happen if the maintainers felt the transition made it easier for them to review and/or commit patches.

* Dropping SVN would be expensive for some. Instead of dropping SVN, it is more reasonable to make git the central repo and have SVN mirror git.

* A linear history is highly valued by Chris and many members of the community.

My input (or, from my perspective, my output):

In my humble opinion, there is one big problem with git-svn and svn. It requires the maintainer to rebase before committing, and in git, this changes the patch's unique ID. Changing the ID creates a serious problem, one which forces the private fork to make an early decision about contributing back to the community. The private fork must decide, "Do we want this patch today, or would we rather wait for it to come in through a 'fast-forward' of the community's repository?" If we choose to accept the patch locally, we have another decision to make: "Do we want to deal with merge conflicts after the patch makes it through the community's review process, or should we just keep it private and enjoy easy automatic merges until the community eventually finds the same bug and redundantly makes a similar fix?"

I hope you see this as not a Good Thing for the community. The policy of rebasing gives private forks an incentive *not* to contribute patches. Please, oh please, do not reply saying "but that's just selfish." The point I am hoping to illustrate is only that this incentive exists, and that it is a consequence of policy.

However, one could argue that the same policy, to always rebase, provides an incentive not to fork at all. That is, it is easier to contribute to the community than to make a private patch and risk merge conflicts. Indeed, but there is one problem, a fact of software: a private fork of any project will always and only exist as a mechanism to meet functional requirements and/or a schedule that do not align with the official "mainline". More concretely, if I have an upcoming release planned and a bug fix that affects the correctness of the compiler, I will most certainly add it to my private fork and not wait on a community review. At this point, I actually have an incentive to stop the code review process and hope the community never finds and fixes my bug. My life is easier when I choose not to contribute, and this is a direct consequence of the policy decision to rebase instead of merge.

But rebasing is fundamental in providing a linear history, right? I question the validity of this popular argument, and argue this is just a tooling issue.
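To make the ID problem above concrete, here is a tiny demonstration (the hashes are obviously made up, and upstream is assumed to have gained commits in the meantime):

    $ git checkout -b fix-bug origin/master
    $ ...hack, hack, hack...
    $ git commit -a -m "Fix the bug"
    $ git rev-parse HEAD             # the patch's ID, as my private fork knows it
    1111111111111111111111111111111111111111
    $ git fetch origin               # meanwhile, mainline has moved forward
    $ git rebase origin/master       # the rebase-before-commit policy
    $ git rev-parse HEAD             # the very same diff, a brand new ID
    2222222222222222222222222222222222222222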
The very fact that a rebase can often be achieved automatically and without conflicts should send a strong signal that it is indeed a tooling issue. As it turns out, the git object tree does encode a linear history. But this is not obvious! "git log" makes an awkward design decision in ordering commits by date. Instead, I think it should be ordered by merge, or specifically, by a pre-order, depth-first traversal of the commit tree. I believe people care more about when a patch entered their own repository than about when the author committed it to his or hers.

Proposal: a slow, multistep, backward-compatible transition to remove the disincentive to contribute patches from private forks:

Step 1: Demonstrate that "git log" or a similar tool can produce a linear history in the presence of merging. This may already be possible.

Step 2: Swap the roles of git and svn. Make svn the mirror and git the central repository, and update the online documentation accordingly. In this step, do not change any policies; continue to require anyone with commit access to maintain a linear history. This restriction is necessary for the svn mirror, but the swap aims to give everyone with svn dependencies a strong hint that LLVM's use of svn is on its way out.

Step 3: Once all svn automation dependencies have been dropped, discontinue the svn mirror. Relax the "always rebase" policy and ask code owners to start preferring merges to rebases.

If the community is willing to make this transition, I commit to coordinating a worldwide decentralized party celebrating our successful move to decentralized version control.

Thank you for your time,
Greg Fitzgerald

P.S. tl;dr, right?
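P.P.S. Regarding Step 1: git's existing flags may already get most of the way there. A rough, untested sketch:

    $ git log --oneline --graph --topo-order
        # keeps each branch's commits grouped together instead of
        # interleaving them by commit date
    $ git log --oneline --first-parent
        # follows only the first parent of each merge, which reads as a
        # single, linear mainline history

Something like --topo-order is roughly the "ordered by merge" traversal I described, and --first-parent already gives a strictly linear view of the mainline.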
On Fri, Nov 16, 2012 at 01:53:12PM -0800, Greg Fitzgerald wrote:
> In my humble opinion, there is one big problem with git-svn and
> svn. It requires the maintainer to rebase before committing, and in
> git, this changes the patch's unique ID. Changing the ID creates
> a serious problem, one which forces the private fork to make an early
> decision about contributing back to the community. The private fork
> must decide, "Do we want this patch today, or would we rather wait for
> it to come in through a 'fast-forward' of the community's repository?"

I fully agree with your analysis, and to me it sounds very much like "we should switch to git because git rebase doesn't work properly", or however else you want to put it.

Joerg
Greg Fitzgerald <garious at gmail.com> writes:
> In my humble opinion, there is one big problem with git-svn and
> svn. It requires the maintainer to rebase before committing, and in
> git, this changes the patch's unique ID.

I didn't totally follow your argument, so I'm sure I missed something.

However, I don't think rebase is really the issue here. Linux kernel developers rebase all the time. It's required before merging to mainline. Of course, at the point of the merge your feature branch should go away, so the rebase is moot.

I _think_ the "rebase" problem you describe has more to do with an inappropriate use of git - keeping branches around through multiple merges. I totally understand the attraction of doing that, but with a rebase policy (common in the git world) it's going to cause issues of the type you described. Still, the merge often works just fine even in the presence of the rebase - git is often smart enough to recognize when a commit has already been applied locally.

The git-svn argument was settled a while ago, so best not to stir that pot right now. But moving to git should _not_ mean we drop linear history. There is no reason it must.

-David
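P.S. One way to see that recognition in action is "git cherry", which compares patch content rather than commit IDs (the branch names and output below are made up):

    $ git fetch origin
    $ git cherry -v origin/master my-feature
    - 1111111 Fix the bug            # an equivalent change already exists upstream
    + 2222222 Add shiny feature      # still unique to my branch

git rebase applies the same patch-equivalence test and silently drops commits that upstream already has.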
David A. Green wrote:
> However, I don't think rebase is really the issue here.

Thanks, David. After a bit of experimentation, I see you are quite right: my "rebase" problem is actually a "squash" problem. As it turns out, every time I rebased and a later pull then caused a manual merge, it was because I had also squashed in changes from a code review. So here's the problematic workflow:

1) fork mainline
2) add patch to fork
3) submit patch to mainline for review
4) patch the patch as part of the review process
5) squash "patch of patch" into "patch"
6) mainline accepts the squashed patch
7) merge into the private fork, and kaboom!

In this workflow, git can't possibly know whether I want the patch or the patched patch. They represent different solutions to the same problem. An automation-friendly workflow adds one step (see the command sketch in the P.S. below):

4.5) submit "patch of patch" to the private fork

Then, when you merge, the history will include your original patch, the patched patch, the squashed patch, and a 'merge' commit that tells git that little mess is no problem.

I retract my proposal and apologize for the noise. I guess I'll have to find some other reason to throw a big party. Suggestions welcome. :-)

-Greg
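P.S. In commands, the automation-friendly version looks roughly like this (branch and remote names are made up):

    # in the private fork: keep the review fixup as its own commit; no squashing
    $ git checkout private-master
    $ git merge my-patch-branch        # brings in both "patch" and "patch of patch"

    # later, mainline accepts the squashed version of the same change
    $ git fetch community
    $ git merge community/master       # both sides end up with the same content,
                                       # so this should resolve as a clean merge commit

The squashed commit still arrives, but via the merge from mainline rather than by rewriting the fork's own history.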
On Fri, Nov 16, 2012 at 1:53 PM, Greg Fitzgerald <garious at gmail.com> wrote:
> My interpretations, which I'll assume later in this long email as
> premises for a recommended action:
>
> * Chris finds code reviewers to be exceptionally rare and the
> community's most valuable participants. My previous "spork"
> suggestion would be a decision made by maintainers, not influenced by
> patch contributors, and would only happen if the maintainers felt the
> transition made it easier for them to review and/or commit patches.
>
> * Dropping SVN would be expensive for some. Instead of dropping SVN,
> it is more reasonable to make git the central repo and have SVN mirror
> git.
>
> * A linear history is highly valued by Chris and many members of the community.

You missed what is (IMO) the most important point: LLVM's development process and VCS optimize for active developers working in the open on mainline LLVM. They don't optimize for private forks or other development processes. This isn't an accident, and it helps incentivize members of the community to contribute early and in small, incremental patches.

> Changing the ID creates
> a serious problem, one which forces the private fork to make an early
> decision about contributing back to the community.

The fact that this (and all of the related and restated problems you and others have outlined since this email) is predicated on a private fork is why it isn't a priority for the process. If you instead make the early decision to contribute small, incremental patches, then this is not a problem.

I'm not making this claim as a hypothetical which I haven't tested. There are numerous groups (including mine) working with LLVM without any problem due to this. There are even several that *do* have some code which doesn't go upstream, and they also are not thwarted by this.

<snip>

> Proposal: a slow, multistep, backward-compatible transition to remove
> the disincentive to contribute patches from private forks:

I strongly doubt that this is the primary barrier to the contribution of such patches. Code review, the fact that these patches have accreted for long periods of time outside the view of the community, and a lack (or broken nature) of incremental development processes seem likely to cost much more.

> P.S. tl;dr, right?

Actually, yes. This entire conversation is too long to read.

Many are claiming there is something wrong with the use of Subversion. And yet, despite these "problems", LLVM and Clang are among the fastest growing and most active open source projects I have had the pleasure of working on. Also, the most active members of the community (who should be hitting these problems most often) are never the ones crying for change.

I suggest folks instead work to demonstrate the scaling problems of Subversion by contributing ever more rapidly to LLVM. Perhaps we will discover the problem, but either way LLVM will improve by leaps and bounds. ;] Everybody wins.
Chandler Carruth <chandlerc at google.com> writes:
> You missed what is (IMO) the most important point: LLVM's development
> process and VCS optimize for active developers working in the open on
> mainline LLVM. They don't optimize for private forks or other
> development processes. This isn't an accident, and it helps
> incentivize members of the community to contribute early and in small,
> incremental patches.

Is it desirable to incentivize members to contribute early and in small, incremental patches? The impression I got from all those years using and (occasionally) watching the development of LLVM is that it is prone to incorporating non-functional or deficient code coming from half-done or tentative projects. That code is removed sooner or later, but it creates noise for both developers and users. Sometimes it is destabilizing too.

It is also a bit hard to understand the preference some people show for reviewing the introduction of a feature as a series of patches of non-definitive code, spread over a significant period of time and mixed in the same queue with dozens of unrelated patches, instead of a cohesive, well-defined, ready-to-run series of patches that clearly implements the advertised feature. The only explanation I can think of is a mindset created by old development styles, when CVS-style tools (like svn) were the only option and accepting any sizable code contribution was a nightmare.

[snip]