Daniel Sanders via llvm-dev
2016-Jul-25 13:55 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: Robinson, Paul [mailto:paul.robinson at sony.com]
> Sent: 22 July 2016 18:50
> To: Renato Golin; Daniel Sanders
> Cc: llvm-dev at lists.llvm.org
> Subject: RE: [llvm-dev] [RFC] One or many git repositories?
>
> > >> * public and downstream forks that *rely* on linear history
> > >
> > > Do you have an example in mind? I'd expect them to rely on each 'master' being
> > > an improvement on 'master^'. I wouldn't expect them to be interested in how
> > > 'master^' became 'master'.
> >
> > Paul Robinson was outlining some of the issues he had with git
> > history. I don't know their setup, so I'll let him describe the issues
> > (or he may have done so already in some thread, but I haven't read it all).
>
> Since you asked...
>
> The key point is that a (basically) linear upstream history makes it
> feasible to do bisection on a downstream branch that mixes in a pile
> of local changes, because the (basically) linear upstream history can
> be merged into the downstream branch commit-by-commit which retains
> the crucial linearity property.
>
> We have learned through experience that a bulk merge from upstream is
> a Bad Idea(tm). Suppose we have a test that fails; it does not repro
> with an upstream compiler; we try to bisect it; we discover that it
> started after a bulk merge of 1000 commits from upstream. But we can't
> bisect down the second-parent line of history, because that turns back
> into a straight upstream compiler and the problem fails to repro.
>
> If instead we had rolled the 1000 commits into our repo individually,
> we'd have a linear history mixing upstream with our stuff and we would
> be able to bisect naturally. But that relies on the *upstream* history
> being basically linear, because we can't pick apart an upstream commit
> that is itself a big merge of lots of commits. At least I don't know how.

I know of a way but it's not very nice. The gist of it is to checkout
the downstream branch just before the bad merge and then merge the first
100 commits from upstream. If the result is good then merge the next
100, but if it's bad then 'git reset --hard' and merge 10 instead. You'll
eventually find the commit that made it bad. Essentially, the idea is to
make a throwaway branch that merges more frequently. I do something
similar to rebase my work to master, since gradually rebasing often
causes all the conflicts to go away.

> Now, I do say "basically" linear because the important thing is to have
> small increments of change each time. It doesn't mean we have to have
> everything be ff-only, and we can surely tolerate the merge commits that
> wrap individual commits in a pull-request kind of workflow. But merges
> that bring in long chains of commits are not what we want.
> --paulr

I agree that we should probably keep the history as close to linear as
possible (mostly because I find the Linux kernel's history difficult to
follow), but it sounds like the issue is more about the content of the
merge than the linearity of the history. A long-lived branch with a
complex history sounds like it would be ok in your scenario if the
eventual merge was a small change to master.
Renato Golin via llvm-dev
2016-Jul-25 14:10 UTC
[llvm-dev] [RFC] One or many git repositories?
On 25 July 2016 at 14:55, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> I know of a way but it's not very nice. The gist of it is to checkout the
> downstream branch just before the bad merge and then merge the first
> 100 commits from upstream. If the result is good then merge the next
> 100, but if it's bad then 'git reset --hard' and merge 10 instead. You'll
> eventually find the commit that made it bad. Essentially, the idea is to
> make a throwaway branch that merges more frequently. I do something
> similar to rebase my work to master since gradually rebasing often
> causes all the conflicts to go away.

This is essentially what git-imerge does; you only need to define
"good merge" in the form of a script or CI job.

cheers,
-renato
Robinson, Paul via llvm-dev
2016-Jul-25 14:12 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Monday, July 25, 2016 7:11 AM
> To: Daniel Sanders
> Cc: Robinson, Paul; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>
> On 25 July 2016 at 14:55, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> > I know of a way but it's not very nice. The gist of it is to checkout the
> > downstream branch just before the bad merge and then merge the first
> > 100 commits from upstream. If the result is good then merge the next
> > 100, but if it's bad then 'git reset --hard' and merge 10 instead. You'll
> > eventually find the commit that made it bad. Essentially, the idea is to
> > make a throwaway branch that merges more frequently. I do something
> > similar to rebase my work to master since gradually rebasing often
> > causes all the conflicts to go away.
>
> This is essentially what git-imerge does, you only need to define
> "good merge" in the form of a script or CI job.
>
> cheers,
> -renato

Except I understood git-imerge to be looking for physical conflicts, not
"when did this test start failing." If it does the latter also, that
would be awesome.
--paulr
Robinson, Paul via llvm-dev
2016-Jul-25 14:20 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: Daniel Sanders [mailto:Daniel.Sanders at imgtec.com]
> Sent: Monday, July 25, 2016 6:56 AM
> To: Robinson, Paul; Renato Golin
> Cc: llvm-dev at lists.llvm.org
> Subject: RE: [llvm-dev] [RFC] One or many git repositories?
>
> > -----Original Message-----
> > From: Robinson, Paul [mailto:paul.robinson at sony.com]
> > Sent: 22 July 2016 18:50
> > To: Renato Golin; Daniel Sanders
> > Cc: llvm-dev at lists.llvm.org
> > Subject: RE: [llvm-dev] [RFC] One or many git repositories?
> >
> > > >> * public and downstream forks that *rely* on linear history
> > > >
> > > > Do you have an example in mind? I'd expect them to rely on each 'master' being
> > > > an improvement on 'master^'. I wouldn't expect them to be interested in how
> > > > 'master^' became 'master'.
> > >
> > > Paul Robinson was outlining some of the issues he had with git
> > > history. I don't know their setup, so I'll let him describe the issues
> > > (or he may have done so already in some thread, but I haven't read it all).
> >
> > Since you asked...
> >
> > The key point is that a (basically) linear upstream history makes it
> > feasible to do bisection on a downstream branch that mixes in a pile
> > of local changes, because the (basically) linear upstream history can
> > be merged into the downstream branch commit-by-commit which retains
> > the crucial linearity property.
> >
> > We have learned through experience that a bulk merge from upstream is
> > a Bad Idea(tm). Suppose we have a test that fails; it does not repro
> > with an upstream compiler; we try to bisect it; we discover that it
> > started after a bulk merge of 1000 commits from upstream. But we can't
> > bisect down the second-parent line of history, because that turns back
> > into a straight upstream compiler and the problem fails to repro.
> >
> > If instead we had rolled the 1000 commits into our repo individually,
> > we'd have a linear history mixing upstream with our stuff and we would
> > be able to bisect naturally. But that relies on the *upstream* history
> > being basically linear, because we can't pick apart an upstream commit
> > that is itself a big merge of lots of commits. At least I don't know how.
>
> I know of a way but it's not very nice. The gist of it is to checkout the
> downstream branch just before the bad merge and then merge the first
> 100 commits from upstream. If the result is good then merge the next
> 100, but if it's bad then 'git reset --hard' and merge 10 instead. You'll
> eventually find the commit that made it bad. Essentially, the idea is to
> make a throwaway branch that merges more frequently. I do something
> similar to rebase my work to master since gradually rebasing often
> causes all the conflicts to go away.

A very manual sort of bisection, but yeah, that would get the job done.

> > Now, I do say "basically" linear because the important thing is to have
> > small increments of change each time. It doesn't mean we have to have
> > everything be ff-only, and we can surely tolerate the merge commits that
> > wrap individual commits in a pull-request kind of workflow. But merges
> > that bring in long chains of commits are not what we want.
> > --paulr
>
> I agree that we should probably keep the history as close to linear as
> possible (mostly because I find the Linux kernel's history difficult to
> follow) but it sounds like the issue is more about the content of the
> merge than the linearity of the history. A long-lived branch with a
> complex history sounds like it would be ok in your scenario if the
> eventual merge was a small change to master.

I think I'd rather see such things squashed before they reach master,
because a normal bisection might still be tempted down the garden path
of the second-parent history.
--paulr