David Chisnall via llvm-dev
2016-Jul-25 11:40 UTC
[llvm-dev] [RFC] One or many git repositories?
On 25 Jul 2016, at 12:09, Renato Golin <renato.golin at linaro.org> wrote:> > A pull request is a request for *comment*, and we currently express > that via a Phab review. If I'm comfortable with the patch, I commit > directly, if I want others opinions, I create a review. If we move as > you propose, I'd lose the second option.A pull request is a mechanism. The community can decide what the policy is.> I also don't think pre-commit review committers should be just relying > on the fast bots passing. There are many other issues like code > quality, formatting, tests, etc. that need to be done properly.I agree, but it’s much harder to automate that (though having clang-format automatically applied might be nice).> If we go this route, we'll be effectively flattening out everyone that > has commit access into one big pool, and that would be a massive > change in how our community operates (for better or worse).That’s not what I’m suggesting. Pull requests provide a convenient staging mechanism. On the back end, they’re simply git branches in a namespace that won’t be automatically fetched by clients that do a clone.> Unless you're also proposing that only the "core team" (code owners, > high committers, etc) get commit access in the new style.I’d suggest that no one should have direct push access (or, at least, the no one should *use* the direct access - presumably some people in what GitHub calls the admin groups would retain access for dealing with problems).>> The workflow would be exactly the same for committers and non-committers, except that committers would be able to tag their own pull requests as ready to merge. > > Can this be controlled via a server hook? One that would be supported by GitHub?GitHub supports everything that is needed to implement this suggestion. From the developers’ perspective, the workflow is: - User sends pull request. - User from a list of people that correspond to our current set of committers labels the request as ‘approved’ (GitHub labels can only be applied by members of the organisation). In the case of committers who are committing to areas where they are comfortable with post-commit review, these two users can be the same person, so you’d send a pull request and add the tag yourself. The buildbot master then: - Collects all of the pull requests (git fetch) - Uses the GitHub API to filter the ones that have been approved - Merges these, in turn, to its local master branch - Sends those changes to the builders - If they succeed, pushes master to GitHub In idle time (when not busy with approved patches), it should also try other pull requests and add a label saying that they’ve passed the tests. There are some obvious refinements to this, for example only kicking the fast buildbots when there are several patches outstanding and then running the slower ones with the combined master, then backtracking and applying them one at a time if this fails. Longer term, it would be nice to integrate things like running the static analyser into this workflow. I don’t have particularly strong feelings about this, I just wanted to point out that it’s possible with GitHub and a little bit of client code talking to the GitHub APIs. There’s a slightly more complex workflow for people to push code, but in exchange for that we’d never be in a state where anything pushed to master introduces regressions for which tests exist. David -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 3719 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160725/dc741896/attachment.bin>
Renato Golin via llvm-dev
2016-Jul-25 12:34 UTC
[llvm-dev] [RFC] One or many git repositories?
On 25 July 2016 at 12:40, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote:> I’d suggest that no one should have direct push accessThat's what I meant by "massive change". :) I don't think that's a bad model, but it's a very different model than we currently have, so would need more discussion on that side before we can even think about the technical details.> - User sends pull request. > - User from a list of people that correspond to our current set of committers labels the request as ‘approved’ (GitHub labels can only be applied by members of the organisation). > The buildbot master then: > - Collects all of the pull requests (git fetch) > - Uses the GitHub API to filter the ones that have been approved > - Merges these, in turn, to its local master branch > - Sends those changes to the builders > - If they succeed, pushes master to GitHubThis sounds a lot like the Gerrit model. Can buildbots do that? How much code would be needed? Wouldn't it be easier to do this in Jenkins (push based)?> In idle time (when not busy with approved patches), it should also try other pull requests and add a label saying that they’ve passed the tests.This sounds like an added complexity to the buildbot code. (priority queues?) Not impossible, maybe not even hard, but yet another small thing to get right...> There are some obvious refinements to this, for example only kicking the fast buildbots when there are several patches outstanding and then running the slower ones with the combined master, then backtracking and applying them one at a time if this fails.If this is going to work, we'll need to run a single pull request on top of the correct "master" for every change. IIUC, GitHub will "trigger" a CI change for every *update* on every pull request, too. So we'll end up with a lot more half-baked patches than usually go through the current buildbots. This will mean more builders per bot (I'm expecting at least 3) and a silent bots (we can use 8014 for that).> I don’t have particularly strong feelings about this, I just wanted to point out that it’s possible with GitHub and a little bit of client code talking to the GitHub APIs. There’s a slightly more complex workflow for people to push code, but in exchange for that we’d never be in a state where anything pushed to master introduces regressions for which tests exist.I'm a big advocate of pre-commit tests and the Gerrit model, I think it makes developers more aware of their own failures (which bbots do) without upsetting the rest of the community (which bbots don't). For instance, in this model, I only need to review code that has been marked by the pre-commit bots as "good", so it also saves a lot of time from code reviewers, in addition to buildbot owners, and it will make the releases, back-porting, bisecting, *much* easier by the massive reduction in silly reverts. If I understand correctly, the model you propose is: 1. Admins: people with direct commit access used only for fixing the repo, fork releases, etc. 2. Pre-review committers: people with the right to create a "auto-merge" tag on pull reviews, but not commit directly. 3. Post-review committers: just pull request access, need to wait for someone in (2) to approve 4. Everyone else: just pull request access, need to wait for someone in (2) to approve A few problems with this: A. How do we divide the current (fuzzy) "groups" into those well defined groups? B. (2) folks will have the choice to get reviews, (3) won't have the choice to quickly fix their commits (which some do today). C. The migration between (3) to (2) is not defined *at all* today, and this is where the most radical cut will be. Post-review commit access is, as I understand, a social recognition for those striving to become more active committers in the future. By flattening (3) and (4), any threshold we pick will be "wrong" in some form, and may discourage people from contributing changes, or striving to become pre-review committers. I'm not saying you're wrong, or that I don't agree with you, but this won't be an easy sell... cheers, --renato
David Chisnall via llvm-dev
2016-Jul-25 13:29 UTC
[llvm-dev] [RFC] One or many git repositories?
> On 25 Jul 2016, at 13:34, Renato Golin <renato.golin at linaro.org> wrote: > > On 25 July 2016 at 12:40, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote: >> I’d suggest that no one should have direct push access > > That's what I meant by "massive change". :) > > I don't think that's a bad model, but it's a very different model than > we currently have, so would need more discussion on that side before > we can even think about the technical details. > > >> - User sends pull request. >> - User from a list of people that correspond to our current set of committers labels the request as ‘approved’ (GitHub labels can only be applied by members of the organisation). >> The buildbot master then: >> - Collects all of the pull requests (git fetch) >> - Uses the GitHub API to filter the ones that have been approved >> - Merges these, in turn, to its local master branch >> - Sends those changes to the builders >> - If they succeed, pushes master to GitHub > > This sounds a lot like the Gerrit model. > > Can buildbots do that? How much code would be needed? > > Wouldn't it be easier to do this in Jenkins (push based)?These are both good questions for someone more familiar with the Buildbot infrastructure than me.> > >> In idle time (when not busy with approved patches), it should also try other pull requests and add a label saying that they’ve passed the tests. > > This sounds like an added complexity to the buildbot code. (priority > queues?) Not impossible, maybe not even hard, but yet another small > thing to get right…Not hard - there are two queues (stuff that’s ready to merge, stuff awaiting review) and you only start on the second queue when the first one is empty.>> There are some obvious refinements to this, for example only kicking the fast buildbots when there are several patches outstanding and then running the slower ones with the combined master, then backtracking and applying them one at a time if this fails. > > If this is going to work, we'll need to run a single pull request on > top of the correct "master" for every change. > > IIUC, GitHub will "trigger" a CI change for every *update* on every > pull request, too. So we'll end up with a lot more half-baked patches > than usually go through the current buildbots.Yes, GitHub can do this on a push model, but the first filter would be ‘does this have the correct label’, and if not then it goes on the low-priority pile.> > This will mean more builders per bot (I'm expecting at least 3) and a > silent bots (we can use 8014 for that). > > >> I don’t have particularly strong feelings about this, I just wanted to point out that it’s possible with GitHub and a little bit of client code talking to the GitHub APIs. There’s a slightly more complex workflow for people to push code, but in exchange for that we’d never be in a state where anything pushed to master introduces regressions for which tests exist. > > I'm a big advocate of pre-commit tests and the Gerrit model, I think > it makes developers more aware of their own failures (which bbots do) > without upsetting the rest of the community (which bbots don't). > > For instance, in this model, I only need to review code that has been > marked by the pre-commit bots as "good", so it also saves a lot of > time from code reviewers, in addition to buildbot owners, and it will > make the releases, back-porting, bisecting, *much* easier by the > massive reduction in silly reverts.These are the advantages. The disadvantages are: - Someone needs to write the Buildbot integration code - Committers need to adopt a slightly more complex workflow (though not more complex than going via phabricator)> If I understand correctly, the model you propose is: > > 1. Admins: people with direct commit access used only for fixing the > repo, fork releases, etc.Yup.> 2. Pre-review committers: people with the right to create a > "auto-merge" tag on pull reviews, but not commit directly.Yup.> 3. Post-review committers: just pull request access, need to wait for > someone in (2) to approveYup.> 4. Everyone else: just pull request access, need to wait for someone > in (2) to approveThere’s not really a difference between 3 and 4 in this model.> > A few problems with this: > > A. How do we divide the current (fuzzy) "groups" into those well defined groups?Group 4 is easy, that’s everyone that we don’t actively put in a group. I have no idea about group 1. This set roughly corresponds today to the set of people who are Chris, and should probably be half a dozen people for a bit more redundancy (admins are also the people who can add people to the GitHub group, create new repos, and so on). It might be something overseen by the foundation. For groups 2/3 as they map to current committers, the same way that we do now: Give people membership in the group and let them know what the community expects them to have reviewed pre-commit. If they start committing things that should have been reviewed without proper review and they don’t change their behaviour, then move them back into group 4.> B. (2) folks will have the choice to get reviews, (3) won't have the > choice to quickly fix their commits (which some do today). > C. The migration between (3) to (2) is not defined *at all* today, and > this is where the most radical cut will be.If the difference is purely cultural and not mechanically enforced, then (3) will be able to fix their commits by labelling them as approved. I don’t see a need for a mechanical separation between groups 2 and 3. In FreeBSD we have a concept of mentorship, where new committers must have ‘Approved by: {name} (mentor)’ in all of their commits and their mentor is responsible for ensuring that they are following the expectations of the community. This is similarly not enforced by the revision control system (though, again, people who don’t do it may find that they can no longer commit).> Post-review commit access is, as I understand, a social recognition > for those striving to become more active committers in the future. By > flattening (3) and (4), any threshold we pick will be "wrong" in some > form, and may discourage people from contributing changes, or striving > to become pre-review committers.I’m not proposing flattening 3 and 4.> I'm not saying you're wrong, or that I don't agree with you, but this > won't be an easy sell…It’s also worth pointing out that this is in no way required to happen at the same time as the GitHub migration (or at all), so should not delay that. Just something for people to think about... David