Daniel Sanders via llvm-dev
2016-Jul-25 13:55 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: Robinson, Paul [mailto:paul.robinson at sony.com]
> Sent: 22 July 2016 18:50
> To: Renato Golin; Daniel Sanders
> Cc: llvm-dev at lists.llvm.org
> Subject: RE: [llvm-dev] [RFC] One or many git repositories?
>
> > >> * public and downstream forks that *rely* on linear history
> > >
> > > Do you have an example in mind? I'd expect them to rely on each 'master' being
> > > an improvement on 'master^'. I wouldn't expect them to be interested in how
> > > 'master^' became 'master'.
> >
> > Paul Robinson was outlining some of the issues he had with git
> > history. I don't know their setup, so I'll let him describe the issues
> > (or he may have done so already in some thread, but I haven't read it all).
>
> Since you asked...
>
> The key point is that a (basically) linear upstream history makes it
> feasible to do bisection on a downstream branch that mixes in a pile
> of local changes, because the (basically) linear upstream history can
> be merged into the downstream branch commit-by-commit which retains
> the crucial linearity property.
>
> We have learned through experience that a bulk merge from upstream is
> a Bad Idea(tm). Suppose we have a test that fails; it does not repro
> with an upstream compiler; we try to bisect it; we discover that it
> started after a bulk merge of 1000 commits from upstream. But we can't
> bisect down the second-parent line of history, because that turns back
> into a straight upstream compiler and the problem fails to repro.
>
> If instead we had rolled the 1000 commits into our repo individually,
> we'd have a linear history mixing upstream with our stuff and we would
> be able to bisect naturally. But that relies on the *upstream* history
> being basically linear, because we can't pick apart an upstream commit
> that is itself a big merge of lots of commits. At least I don't know how.

I know of a way but it's not very nice. The gist of it is to checkout
the downstream branch just before the bad merge and then merge the first
100 commits from upstream. If the result is good then merge the next
100, but if it's bad then 'git reset --hard' and merge 10 instead. You'll
eventually find the commit that made it bad. Essentially, the idea is to
make a throwaway branch that merges more frequently. I do something
similar to rebase my work to master, since gradually rebasing often
causes all the conflicts to go away.

> Now, I do say "basically" linear because the important thing is to have
> small increments of change each time. It doesn't mean we have to have
> everything be ff-only, and we can surely tolerate the merge commits that
> wrap individual commits in a pull-request kind of workflow. But merges
> that bring in long chains of commits are not what we want.
> --paulr

I agree that we should probably keep the history as close to linear as
possible (mostly because I find the Linux kernel's history difficult to
follow), but it sounds like the issue is more about the content of the
merge than the linearity of the history. A long-lived branch with a
complex history sounds like it would be ok in your scenario if the
eventual merge was a small change to master.
Renato Golin via llvm-dev
2016-Jul-25 14:10 UTC
[llvm-dev] [RFC] One or many git repositories?
On 25 July 2016 at 14:55, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> I know of a way but it's not very nice. The gist of it is to checkout the
> downstream branch just before the bad merge and then merge the first
> 100 commits from upstream. If the result is good then merge the next
> 100, but if it's bad then 'git reset --hard' and merge 10 instead. You'll
> eventually find the commit that made it bad. Essentially, the idea is to
> make a throwaway branch that merges more frequently. I do something
> similar to rebase my work to master since gradually rebasing often
> causes all the conflicts to go away.

This is essentially what git-imerge does; you only need to define
"good merge" in the form of a script or CI job.

cheers,
-renato
Robinson, Paul via llvm-dev
2016-Jul-25 14:12 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: Renato Golin [mailto:renato.golin at linaro.org]
> Sent: Monday, July 25, 2016 7:11 AM
> To: Daniel Sanders
> Cc: Robinson, Paul; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>
> On 25 July 2016 at 14:55, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:
> > I know of a way but it's not very nice. The gist of it is to checkout the
> > downstream branch just before the bad merge and then merge the first
> > 100 commits from upstream. If the result is good then merge the next
> > 100, but if it's bad then 'git reset --hard' and merge 10 instead. You'll
> > eventually find the commit that made it bad. Essentially, the idea is to
> > make a throwaway branch that merges more frequently. I do something
> > similar to rebase my work to master since gradually rebasing often
> > causes all the conflicts to go away.
>
> This is essentially what git-imerge does, you only need to define
> "good merge" in the form of a script or CI job.
>
> cheers,
> -renato

Except I understood git-imerge to be looking for physical conflicts, not
"when did this test start failing." If it does the latter also, that
would be awesome.
--paulr
Robinson, Paul via llvm-dev
2016-Jul-25 14:20 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: Daniel Sanders [mailto:Daniel.Sanders at imgtec.com]
> Sent: Monday, July 25, 2016 6:56 AM
> To: Robinson, Paul; Renato Golin
> Cc: llvm-dev at lists.llvm.org
> Subject: RE: [llvm-dev] [RFC] One or many git repositories?
>
> > -----Original Message-----
> > From: Robinson, Paul [mailto:paul.robinson at sony.com]
> > Sent: 22 July 2016 18:50
> > To: Renato Golin; Daniel Sanders
> > Cc: llvm-dev at lists.llvm.org
> > Subject: RE: [llvm-dev] [RFC] One or many git repositories?
> >
> > > >> * public and downstream forks that *rely* on linear history
> > > >
> > > > Do you have an example in mind? I'd expect them to rely on each 'master' being
> > > > an improvement on 'master^'. I wouldn't expect them to be interested in how
> > > > 'master^' became 'master'.
> > >
> > > Paul Robinson was outlining some of the issues he had with git
> > > history. I don't know their setup, so I'll let him describe the issues
> > > (or he may have done so already in some thread, but I haven't read it all).
> >
> > Since you asked...
> >
> > The key point is that a (basically) linear upstream history makes it
> > feasible to do bisection on a downstream branch that mixes in a pile
> > of local changes, because the (basically) linear upstream history can
> > be merged into the downstream branch commit-by-commit which retains
> > the crucial linearity property.
> >
> > We have learned through experience that a bulk merge from upstream is
> > a Bad Idea(tm). Suppose we have a test that fails; it does not repro
> > with an upstream compiler; we try to bisect it; we discover that it
> > started after a bulk merge of 1000 commits from upstream. But we can't
> > bisect down the second-parent line of history, because that turns back
> > into a straight upstream compiler and the problem fails to repro.
> >
> > If instead we had rolled the 1000 commits into our repo individually,
> > we'd have a linear history mixing upstream with our stuff and we would
> > be able to bisect naturally. But that relies on the *upstream* history
> > being basically linear, because we can't pick apart an upstream commit
> > that is itself a big merge of lots of commits. At least I don't know how.
>
> I know of a way but it's not very nice. The gist of it is to checkout the
> downstream branch just before the bad merge and then merge the first
> 100 commits from upstream. If the result is good then merge the next
> 100, but if it's bad then 'git reset --hard' and merge 10 instead. You'll
> eventually find the commit that made it bad. Essentially, the idea is to
> make a throwaway branch that merges more frequently. I do something
> similar to rebase my work to master since gradually rebasing often
> causes all the conflicts to go away.

A very manual sort of bisection, but yeah, that would get the job done.

> > Now, I do say "basically" linear because the important thing is to have
> > small increments of change each time. It doesn't mean we have to have
> > everything be ff-only, and we can surely tolerate the merge commits that
> > wrap individual commits in a pull-request kind of workflow. But merges
> > that bring in long chains of commits are not what we want.
> > --paulr
>
> I agree that we should probably keep the history as close to linear as
> possible (mostly because I find the Linux kernel's history difficult to
> follow) but it sounds like the issue is more about the content of the
> merge than the linearity of the history. A long-lived branch with a
> complex history sounds like it would be ok in your scenario if the
> eventual merge was a small change to master.

I think I'd rather see such things squashed before they reach master,
because a normal bisection might still be tempted down the garden path
of the second-parent history.
--paulr