Joachim Durchholz via llvm-dev
2016-Feb-25 12:46 UTC
[llvm-dev] RFC: Move the test-suite LLVM project to GitHub?
On 25.02.2016 at 12:33, Renato Golin via llvm-dev wrote:
> Kristof, Chandler,
>
> I think most of the responses seem favourable of the move, the
> concerns being which Git repo we'll use (GitHub, GitLab, BitBucket),
> but they're essentially identical on the git side.

I'm repeating myself, but no, they are not identical on the git side. Not for people without push access to the main repo anyway.
There's the GitHub model where you need to set up a remote repository, and the non-GitHub model where you push to a branch within the master repository.
Obviously, the non-GitHub model needs to protect people from messing with other people's branches, and particularly with the master branch, but that's pretty much a solved problem.

> Infrastructure decisions will need to be taken into account, but that
> doesn't interfere with the "how we commit" discussion in any way.

Regular committers with write access to repositories have a different workflow than occasional contributors who send patches or pull requests.
You need to consider all values of "we" that you are interested in getting work from.

> I believe all those services use something similar to git-new-workdir,
> so even if we have 100 forks of the test-suite, we won't have 100 x
> 1GB of used space.

GitHub explicitly confirms that "forking" (remotely cloning) a repo does not copy the data.

> But if we move into a "less owners" scenario, we
> will penalise them with pull-requests all the time.

What workflow are you comparing this to, if a pull request is a penalty?

> This is a little more complex. SVN is very conservative on history,
> and that saves us from destroying the origin. Git, on the other hand,
> allows anyone with write access to completely wipe out the repo.

This depends very much on repo configuration.
For git itself, write access is roughly equivalent to svn administrator access. However, all public-writable repos have an authorization layer that prevents history from ever being wiped.
Take a look at gitolite, that's the standard tool for managing such a layer.

> If anyone with more git experience than me can come up with a safe way
> to have 100s of committers pushing to master, I'd be happy to know.

Nobody is allowing 100s of committers to push to master, that would be silly.
You push to either a remote clone (GitHub) or a remote branch (GitLab), and the master's owner does the merge.
These are essentially an alternative to sending patches by mail, with the added bonus that the Git* websites give you a forum website where you can open threads on individual lines of code, including an email gateway.

Regards,
Jo
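
The per-branch protection described in the message above can be illustrated with a toy rule. This is only a sketch under stated assumptions, not gitolite's actual implementation or configuration syntax: the `Push` type, the `may_push` function, and the `refs/heads/personal/<user>/` naming scheme are all hypothetical names chosen for the example. The idea is that contributors may do anything inside their own branch namespace, while master accepts only non-rewriting pushes from known committers.

```python
from dataclasses import dataclass

# Hypothetical name of the protected branch for this example.
MASTER = "refs/heads/master"


@dataclass(frozen=True)
class Push:
    user: str
    ref: str
    rewrites_history: bool  # True for a forced (non-fast-forward) push


def may_push(push: Push, committers: frozenset) -> bool:
    """Toy access rule: an assumption for illustration, not a real tool's logic."""
    # Each user owns a private namespace and may rewrite history there.
    personal_prefix = f"refs/heads/personal/{push.user}/"
    if push.ref.startswith(personal_prefix):
        return True
    # Master accepts pushes from committers only, and never a history rewrite.
    if push.ref == MASTER:
        return push.user in committers and not push.rewrites_history
    # Everything else, including other people's namespaces, is denied.
    return False
```

With rules of this shape, someone without commit access can still publish work on the shared repository without being able to touch master or anyone else's branches.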
Renato Golin via llvm-dev
2016-Feb-25 13:41 UTC
[llvm-dev] RFC: Move the test-suite LLVM project to GitHub?
On 25 February 2016 at 12:46, Joachim Durchholz via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I'm repeating myself, but no, they are not identical on the git side. Not for
> people without push access to the main repo anyway.

We're talking about the people with commit access to the main repo. People without commit access still need someone with commit access to push/merge.

> Regular committers with write access to repositories have a different
> workflow than occasional contributors who send patches or pull requests.
> You need to consider all values of "we" that you are interested in getting
> work from.

Practically, if we don't want to change *anything*, non-committers can have the same process they have today when submitting patches. GitHub/Lab will just be yet another source repository.

> GitHub explicitly confirms that "forking" (remotely cloning) a repo does not
> copy the data.

Thought so.

> What workflow are you comparing this to, if a pull request is a penalty?

Today, 100s of people commit directly. In a GitHub style, 100s of people will have to wait for a merge from a few folks. Those few folks will have the additional job of merging lots of patches. It's more like GCC, where maintainers are gate-keepers, and not one we have used so far. I'm ok with it, but other people might not.

> Nobody is allowing 100s of committers to push to master, that would be
> silly.

My personal preference is to not do that. I can't even cope with myself pushing to master from different repositories, let alone 100s of people. But I take it everyone else is indicating that would be the way forward.

I haven't mastered git enough to know the pros and cons, but you seem certain. Can you share your thoughts on the subject?

> You push to either a remote clone (GitHub) or a remote branch (GitLab), and
> the master's owner does the merge. These are essentially an alternative to
> sending patches by mail, with the added bonus that the Git* websites give
> you a forum website where you can open threads on individual lines of code,
> including an email gateway.

I think we all know the benefits of using GitHub/Lab, but we already have Phabricator, which works in a similar way, and we kind of cope with it being on a separate repository. Some people love Phab and Arc, I don't particularly like it myself and do prefer the GitHub style, but I'm just one guy.

The main reason to move to GitHub/Lab is one of cost: storage, bandwidth and uptime, not one of tools. Even if we end up using the GitHub interface in the future, I think we should consider a less radical move first.

Of course, if the problems of moving to git but still following our style become prohibitive, I think we should move to GitHub and use their style, as IMHO, the cost argument is stronger than the tooling.

cheers,
--renato
Joachim Durchholz via llvm-dev
2016-Feb-25 15:01 UTC
[llvm-dev] RFC: Move the test-suite LLVM project to GitHub?
On 25.02.2016 at 14:41, Renato Golin wrote:
> On 25 February 2016 at 12:46, Joachim Durchholz via llvm-dev
>> What workflow are you comparing this to, if a pull request is a penalty?
>
> Today, 100s of people commit directly.

Ah. I wasn't aware of that, I thought LLVM had a gatekeeper model.

> In a GitHub style, 100s of
> people will have to wait for a merge from a few folks.

I guess in that case, you'd probably give these people write access to the GitHub repo.
It would probably be a good idea to prevent them from force-pushing and rebasing on the master branch so they cannot change recorded history. I am not sure whether GitHub allows that kind of restriction.

> Those few folks
> will have the additional job of merging lots of patches. It's more
> like GCC, where maintainers are gate-keepers, and not one we have used
> so far. I'm ok with it, but other people might not.

It would be a considerable workload, but it would also be a considerable QA measure.
From what I have seen, QA seems quite healthy on LLVM so I suppose the project will find that unattractive.

>> Nobody is allowing 100s of committers to push to master, that would be
>> silly.
>
> My personal preference is to not do that. I can't even cope with
> myself pushing to master from different repositories, let alone 100s
> of people. But I take it everyone else is indicating that would be the
> way forward.

I do not think this is on the table yet.

> I haven't mastered git enough to know the pros and cons, but you seem
> certain. Can you share your thoughts on the subject?

Well, PRs are a good way to discuss proposed patches, so you can take the discussion there. Particularly if the git hoster gives you all the web forum thingies you want, including the ability to be helpful with Markdown.
Also, a PR is easy to integrate once it's done.
So while the PR route is very, very unattractive if there's a readymade patch that you just need to integrate, it's a useful tool while the patch is being built, assuming the PR runs through a website that offers a discussion.
It's just very slick if you don't have to quote some code just to say "I'd like to have xxx changed on line 512 of that patch", if you can simply point to the line of code and can say "I'd like to have xxx changed here". You can write annotations while you read. It does encourage write-before-you-think, so it does have its downsides, but it is still a big timesaver because everybody is known to be in the same boat wrt. what line of code one is talking about.

Now that's the website side of things.

On the git side of things, you need to make sure that history isn't rewritten on official branches such as master.
One way to achieve that is protecting those branches (that's the GitLab mechanism, GitHub implemented something similar last fall but I don't know if it's really the same); in that case people can still push to master without risking history.
The other would be setting up a clone and going through PRs even if you don't need a review. PRs do not really protect you against a history rewrite, they can still kill history, but you'll have to force-pull that, so the people who do the pull will be alerted to potential damage.

On many projects, people still go through PRs even if they have direct write access. Simply to make sure that a second pair of eyes has looked at the code before it goes in.
It does require disciplined committers, so this does not work too well in practice. To make it fly, committers need to prioritize reviewing over committing and keep that up; I do not know whether that's a viable policy for LLVM. It does have its advantages if everybody knows his code will be eyeballed before it goes in though.
But if it's a policy change for LLVM, this is something to consider for experimentation after the GH repo is set up, nothing that's up for decision right now.

> The main reason to move to GitHub/Lab is one of cost: storage,
> bandwidth and uptime, not one of tools.

For that, public git hosting services are a no-brainer.
You need to look at permissions because you can't simply set up gitolite, you have to live with whatever the service offers.

> Even if we end up using the
> GitHub interface in the future, I think we should consider a less
> radical move first.

The opportunities will be there from day one. Whether they are being used, or useful, is something to explore.

> Of course, if the problems of moving to git but still following our
> style becomes prohibitive, I think we should move to GitHub and use
> their style, as IMHO, the cost argument is stronger than the tooling.

The GitHub "flow" isn't the right one for every project, so the tooling does matter.
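
On the git side, "history isn't rewritten" has a precise meaning: a push to a branch is a fast-forward when the branch's old tip is an ancestor of the proposed new tip, and a protected branch refuses anything else. A minimal sketch of that check follows, assuming the commit graph is given as a plain dictionary from commit id to parent ids. The function names and the dictionary representation are illustrative assumptions, not how any hosting service actually implements branch protection.

```python
def is_ancestor(ancestor, commit, parents):
    """Return True if `ancestor` is reachable from `commit` via parent links."""
    stack = [commit]
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Root commits have no entry (or an empty one) in the parent map.
        stack.extend(parents.get(current, ()))
    return False


def is_fast_forward(old_tip, new_tip, parents):
    """A push keeps history intact only if the old tip stays reachable."""
    return is_ancestor(old_tip, new_tip, parents)


# Hypothetical history: A <- B <- C is the normal line, X is a rewrite of B.
history = {"B": ["A"], "C": ["B"], "X": ["A"]}
accept_normal_push = is_fast_forward("B", "C", history)
accept_rewrite = is_fast_forward("B", "X", history)
```

In this toy history, moving master from `B` to `C` would be accepted, while moving it from `B` to `X` would drop `B` from the branch and so would be rejected as a rewrite.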