Dean Michael Berris via llvm-dev
2016-Jul-29 12:04 UTC
[llvm-dev] [RFC] One or many git repositories?
> On 29 Jul 2016, at 21:58, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote:
>
> On 29 Jul 2016, at 12:35, Dean Michael Berris <dean.berris at gmail.com> wrote:
>>
>> I understand this, but why isn't "the repo you're interested in" just the megarepo (or monorepo) where every LLVM project resides?
>
> Your assumption is a downstream user of LLVM. As previously pointed out, we have downstream users of libc++ and the sanitizer runtimes who compile with gcc. For a downstream user of LLVM, the cost of getting everything else is in the noise. For a downstream user of libc++ who may want to contribute upstream, the overhead is huge.

Even then, are we seriously ignoring the fact that even if you did clone the whole repository including everything, that you can still build just the libc++ and sanitiser runtimes if you wanted to? Why is this "noise" of any importance to the users who get what they want and then some?

I know some people use only numbered releases of LLVM and the projects. They can keep using those as long as LLVM provides them.

Is it really impossible to just build non-LLVM dependent versions of libc++ or the sanitiser runtimes if they reside in one git megarepo?
Renato Golin via llvm-dev
2016-Jul-29 12:47 UTC
[llvm-dev] [RFC] One or many git repositories?
On 29 July 2016 at 13:04, Dean Michael Berris via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> Is it really impossible to just build non-LLVM dependent versions of libc++ or the sanitiser runtimes if they reside in one git megarepo?

The more intricate the relationship between the components, the less we'll test for the alternative solutions.

My use is solely from a toolchain point of view. For me, having it all in one blob would be perfect, and I would never have to worry about integrations again. (in a perfect world, etc...)

But a good number of projects (and products) use LLVM trunk (not releases) and they use it in slightly different ways. This has driven a lot of refactoring around the libraries over the last few years and I think it's a positive thing. A good number of *upstream* developers contribute to LLVM under those premises, and the harder we make it for them, the fewer of them we'll have. I don't think that's a wise move.

Furthermore, losing the ability to clearly separate things makes them become one disparate group, rather than two independent ones.

cheers,
--renato
Robinson, Paul via llvm-dev
2016-Jul-29 14:26 UTC
[llvm-dev] [RFC] One or many git repositories?
> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Dean Michael Berris via llvm-dev
> Sent: Friday, July 29, 2016 5:04 AM
> To: David Chisnall
> Cc: LLVM Developers; Bruce Hoult
> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>
> > On 29 Jul 2016, at 21:58, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote:
> >
> > On 29 Jul 2016, at 12:35, Dean Michael Berris <dean.berris at gmail.com> wrote:
> >>
> >> I understand this, but why isn't "the repo you're interested in" just the megarepo (or monorepo) where every LLVM project resides?
> >
> > Your assumption is a downstream user of LLVM. As previously pointed out, we have downstream users of libc++ and the sanitizer runtimes who compile with gcc. For a downstream user of LLVM, the cost of getting everything else is in the noise. For a downstream user of libc++ who may want to contribute upstream, the overhead is huge.
>
> Even then, are we seriously ignoring the fact that even if you did clone the whole repository including everything, that you can still build just the libc++ and sanitiser runtimes if you wanted to?

Is it that easy to build a subset of a large checked-out tree? I haven't tried it but my impression is: not so much. Certainly the advertised tactics for configuring/building don't tell you how to do that. Somebody figuring out what it takes would be very constructive here, instead of just asserting it can't possibly be that hard.

> Why is this "noise" of any importance to the users who get what they want and then some?

You want to drive to work? Here, have this semi-trailer; everything you want and then some.

I believe David Chisnall up-thread cited a difference in checkout times on the order of a handful of seconds versus a couple of minutes. While naively it might seem not a big deal, over time and depending on what you are trying to do yes it can be a big burden.

For example right now I have a glitch somewhere in my merge process. It's taking an extra 10-12 seconds longer to do something than I think it should, per commit. NBD right? Except when you're 100 commits behind and trying to catch up, now you're talking about >15 minutes wasted. Again in the grand scheme of things 15 minutes doesn't seem like much but it seriously affects my productivity; it's actually hard to come up with tasks that small that I can context-switch to and back easily. Interruptions like that really are bad for your ability to concentrate on the intellectual task of getting your patch to work.
--paulr

> I know some people use only numbered releases of LLVM and the projects. They can keep using those as long as LLVM provides them.
>
> Is it really impossible to just build non-LLVM dependent versions of libc++ or the sanitiser runtimes if they reside in one git megarepo?
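As a rough check on the figures in Paul's message above, the total cost of a fixed per-commit delay is simply the product of the two numbers. The sketch below does that arithmetic in shell. The function name `catch_up_minutes` is hypothetical; the inputs (100 commits, 10 to 12 extra seconds each) are the ones quoted in the message.

```shell
# catch_up_minutes COMMITS SECONDS_PER_COMMIT
# Prints the total extra time in whole minutes (integer division, rounded down).
# Hypothetical helper, only here to illustrate the arithmetic.
catch_up_minutes() {
    echo $(( $1 * $2 / 60 ))
}

catch_up_minutes 100 10   # lower bound from the message, prints 16
catch_up_minutes 100 12   # upper bound from the message, prints 20
```

Both bounds are consistent with the ">15 minutes" estimate in the message.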
Bruce Hoult via llvm-dev
2016-Jul-29 14:51 UTC
[llvm-dev] [RFC] One or many git repositories?
On Sat, Jul 30, 2016 at 2:26 AM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > Even then, are we seriously ignoring the fact that even if you did clone the whole repository including everything, that you can still build just the libc++ and sanitiser runtimes if you wanted to?
>
> Is it that easy to build a subset of a large checked-out tree? I haven't tried it but my impression is: not so much. Certainly the advertised tactics for configuring/building don't tell you how to do that. Somebody figuring out what it takes would be very constructive here, instead of just asserting it can't possibly be that hard.

Right now, no. The build system assumes that if you checked something out then you want to build it. This needs to change.

> I believe David Chisnall up-thread cited a difference in checkout times on the order of a handful of seconds versus a couple of minutes. While naively it might seem not a big deal, over time and depending on what you are trying to do yes it can be a big burden

That's a one time cost, not every time you do an update.
Renato Golin via llvm-dev
2016-Jul-29 14:52 UTC
[llvm-dev] [RFC] One or many git repositories?
On 29 July 2016 at 15:26, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I believe David Chisnall up-thread cited a difference in checkout times on the order of a handful of seconds versus a couple of minutes. While naively it might seem not a big deal, over time and depending on what you are trying to do yes it can be a big burden.

TL;DR: This thread is dead. Let's move on.

I think the biggest fallacy in this thread is that changing process is cheap.

It is certainly cheap for me to do "git foo" instead of "git bar" from now on. It's moderately expensive to change my buildbot configurations, Zorg's builders and re-test everything for public CI. It's a lot more expensive to change how distributions build their hundreds of thousands of packages over multiple LTS releases, or how downstream users like Sony, Apple or ARM re-factor their entire build systems (which very likely link to a lot of non-LLVM stuff), and then some.

None of that is impossible, most of that is a "one off". Most of the companies and big projects "could" afford to do that. But there are two big points that people like me, Paul and David have been unsuccessfully trying to make obvious:

1. Not every LLVM user is as big as FreeBSD, Sony or Apple. There are a lot of very interesting projects (hobbyists, academia, professional) using Clang, LLVM, libc++, etc. that don't have the staff to do that move. Being a hobbyist myself, I know too well that, when a library radically changes the way it behaves (like boost did every new release about 10 years ago), I will stop using it.

2. Changes in complex systems have unwanted larger consequences. Build systems are some of the most complex systems in existence because they're mostly irrational and patched together with duct tape and paper clips. What may be very simple for some build systems could be impossible for others, and that's not the other team's fault.

So, if you have a complex build system yourself, and you spent some time and have figured out that it would be easy, you *cannot* assume it should be easy for everyone with a less or equally complex build system.

If you find it simple to change your own workflow towards this or that solution, you *cannot* assume everyone else should feel the same. This also doesn't diminish their intelligence or competence. Intelligent and competent people work in very different ways, and it's actually because of that fact that we can do such complex software work in a multitude of systems. If we were all equal, we wouldn't need to discuss anything. :)

Mehdi said very early, and repeated many times, on some of the threads, something to the effect of: "Saying how hard or easy it is for you is an invalid argument, we need more concrete facts". I absolutely agree with that statement, but interpreting how easy or hard concrete facts would be falls into the same fallacy, so it doesn't bring us closer to consensus, it brings us closer to dissent.

That is why I think this thread has long outlived its usefulness, and we need a concrete write up on the proposal. (I hear it's in progress, let's wait for it).

From now on, I'd propose that the discussion be *just* about this specific proposal, preferably over a Phabricator review on the document. People that have strong opinions about it should wait for the survey.

Just to reiterate, the survey is to collect opinions in a formal and non-passionate manner. It will not be a "majority vote", and we're not locked into these two solutions exactly as they're drawn out in the documents, nor are we forced to take any decision if the community is clearly split.

The last thing I want is to destroy part of the community while trying to make it better. But this long thread is not doing any good either.

cheers,
--renato
Mehdi Amini via llvm-dev
2016-Jul-29 17:06 UTC
[llvm-dev] [RFC] One or many git repositories?
> On Jul 29, 2016, at 7:26 AM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> -----Original Message-----
>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Dean Michael Berris via llvm-dev
>> Sent: Friday, July 29, 2016 5:04 AM
>> To: David Chisnall
>> Cc: LLVM Developers; Bruce Hoult
>> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>>
>>> On 29 Jul 2016, at 21:58, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote:
>>>
>>> On 29 Jul 2016, at 12:35, Dean Michael Berris <dean.berris at gmail.com> wrote:
>>>>
>>>> I understand this, but why isn't "the repo you're interested in" just the megarepo (or monorepo) where every LLVM project resides?
>>>
>>> Your assumption is a downstream user of LLVM. As previously pointed out, we have downstream users of libc++ and the sanitizer runtimes who compile with gcc. For a downstream user of LLVM, the cost of getting everything else is in the noise. For a downstream user of libc++ who may want to contribute upstream, the overhead is huge.
>>
>> Even then, are we seriously ignoring the fact that even if you did clone the whole repository including everything, that you can still build just the libc++ and sanitiser runtimes if you wanted to?
>
> Is it that easy to build a subset of a large checked-out tree? I haven't tried it but my impression is: not so much.

If the layout is flat, what difficulty do you expect compared to today’s situation?

> Certainly the advertised tactics for configuring/building don't tell you how to do that. Somebody figuring out what it takes would be very constructive here, instead of just asserting it can't possibly be that hard.
>
>> Why is this "noise" of any importance to the users who get what they want and then some?
>
> You want to drive to work? Here, have this semi-trailer; everything you want and then some.
>
> I believe David Chisnall up-thread cited a difference in checkout times on the order of a handful of seconds versus a couple of minutes. While naively it might seem not a big deal, over time and depending on what you are trying to do yes it can be a big burden.

There are still the read-only views, and the shallow clones that address the non-committer cases. For committers, this is a one time cost; I have some difficulty seriously considering this a “burden”.

> For example right now I have a glitch somewhere in my merge process. It's taking an extra 10-12 seconds longer to do something than I think it should, per commit. NBD right? Except when you're 100 commits behind and trying to catch up, now you're talking about >15 minutes wasted.

I don’t really understand what you’re talking about, but for downstream users with complex integration (like ours), a single monorepo should be an important simplification of the process. (Otherwise you can still integrate from the read-only repos anyway).

-- Mehdi

> Again in the grand scheme of things 15 minutes doesn't seem like much but it seriously affects my productivity; it's actually hard to come up with tasks that small that I can context-switch to and back easily. Interruptions like that really are bad for your ability to concentrate on the intellectual task of getting your patch to work.
> --paulr
>
>> I know some people use only numbered releases of LLVM and the projects. They can keep using those as long as LLVM provides them.
>>
>> Is it really impossible to just build non-LLVM dependent versions of libc++ or the sanitiser runtimes if they reside in one git megarepo?
Dean Michael Berris via llvm-dev
2016-Jul-30 06:10 UTC
[llvm-dev] [RFC] One or many git repositories?
> On 29 Jul 2016, at 22:47, Renato Golin <renato.golin at linaro.org> wrote:
>
> On 29 July 2016 at 13:04, Dean Michael Berris via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Is it really impossible to just build non-LLVM dependent versions of libc++ or the sanitiser runtimes if they reside in one git megarepo?
>
> The more intricate the relationship between the components, the less we'll test for the alternative solutions.

I agree with this gem of an insight, thank you. But that doesn't mean we wouldn't test for those -- just that we should be vigilant about it and make sure we support the various use-cases we already do, and then some.

> My use is solely from a toolchain point of view. For me, having it all in one blob would be perfect, and I would never have to worry about integrations again. (in a perfect world, etc...)
>
> But a good number of projects (and products) use LLVM trunk (not releases) and they use it in slightly different ways. This has driven a lot of refactoring around the libraries over the last few years and I think it's a positive thing. A good number of *upstream* developers contribute to LLVM under those premises, and the harder we make it for them, the fewer of them we'll have. I don't think that's a wise move.

I don't see how making it a mono-repo would make it harder for them (LLVM developers) to keep things un-broken for this use-case *if* we have infrastructure already testing the standalone builds (which, AFAICT, we do, because I've broken them a couple of times now :D).

Note this is predicated on making sure we do have explicit tests for these situations and I 100% agree that we should have those. But that is beside the point of whether we have a mega-repo or 100 different smaller ones. (I exaggerate, we only have ~10 or so, I've already lost count). In fact I think having the many "independent" repositories makes it harder to test (as is already the case).

> Furthermore, losing the ability to clearly separate things makes them become one disparate group, rather than two independent ones.

Can you elaborate more on why keeping things separate is beneficial to:

- Current and future LLVM vertical developers (those working on LLVM, Clang, compiler-rt, parallel_libs, etc.)
- Downstream users who have to keep track of the separate projects and repositories in their local workflows
- Casual contributors who find bugs and want to help clean something up

As someone who's new to all this LLVM development, I'd really like to understand why it _seems_ like we really want to keep the status quo of "too hard to make changes and maintain". I understand the "engineering tradeoff" of not changing something that's already working, but there's also the principle of "continuous improvement" -- i.e. if a megarepo makes the development process simpler and enables us to support *more* downstream users *better*, maybe that's a strictly better situation than what we have now?

Cheers
Dean Michael Berris via llvm-dev
2016-Jul-30 06:19 UTC
[llvm-dev] [RFC] One or many git repositories?
> On 30 Jul 2016, at 00:26, Robinson, Paul <paul.robinson at sony.com> wrote:
>
>> -----Original Message-----
>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Dean Michael Berris via llvm-dev
>> Sent: Friday, July 29, 2016 5:04 AM
>> To: David Chisnall
>> Cc: LLVM Developers; Bruce Hoult
>> Subject: Re: [llvm-dev] [RFC] One or many git repositories?
>>
>>> On 29 Jul 2016, at 21:58, David Chisnall <david.chisnall at cl.cam.ac.uk> wrote:
>>>
>>> On 29 Jul 2016, at 12:35, Dean Michael Berris <dean.berris at gmail.com> wrote:
>>>>
>>>> I understand this, but why isn't "the repo you're interested in" just the megarepo (or monorepo) where every LLVM project resides?
>>>
>>> Your assumption is a downstream user of LLVM. As previously pointed out, we have downstream users of libc++ and the sanitizer runtimes who compile with gcc. For a downstream user of LLVM, the cost of getting everything else is in the noise. For a downstream user of libc++ who may want to contribute upstream, the overhead is huge.
>>
>> Even then, are we seriously ignoring the fact that even if you did clone the whole repository including everything, that you can still build just the libc++ and sanitiser runtimes if you wanted to?
>
> Is it that easy to build a subset of a large checked-out tree? I haven't tried it but my impression is: not so much.

I tried it for compiler-rt hosted in an LLVM checkout and it works just fine. I can't say for other libraries/tools in LLVM, but if it doesn't work for them then that's something worth fixing (if that's something that's explicitly supported by the community).

> Certainly the advertised tactics for configuring/building don't tell you how to do that. Somebody figuring out what it takes would be very constructive here, instead of just asserting it can't possibly be that hard.

Sorry, I wasn't asserting anything, I was conjecturing (if that's even a word).

>> Why is this "noise" of any importance to the users who get what they want and then some?
>
> You want to drive to work? Here, have this semi-trailer; everything you want and then some.

I think that's a tenuous analogy -- if I only have to drive to work *once* and then get a new faster and easier to navigate mode of transport once I get there (maybe because the workplace will provide a better way once you've gotten there), then sure, I'll take that trailer and haul some stuff along the way too. :)

> I believe David Chisnall up-thread cited a difference in checkout times on the order of a handful of seconds versus a couple of minutes. While naively it might seem not a big deal, over time and depending on what you are trying to do yes it can be a big burden.

Sorry, not checkout times -- clone times. Or did you mean to be using SVN still? In that case you should still be able to use the per-project mirror on GitHub using the SVN interface. If you were going to use git, then you clone it once and then 'git pull --rebase upstream master' or something similar.

> For example right now I have a glitch somewhere in my merge process. It's taking an extra 10-12 seconds longer to do something than I think it should, per commit. NBD right? Except when you're 100 commits behind and trying to catch up, now you're talking about >15 minutes wasted.
> Again in the grand scheme of things 15 minutes doesn't seem like much but it seriously affects my productivity; it's actually hard to come up with tasks that small that I can context-switch to and back easily. Interruptions like that really are bad for your ability to concentrate on the intellectual task of getting your patch to work.

I find this too if I do "git svn rebase". What I do now, though, is pull from the git mirrors, then "git svn rebase -l" -- the git pull takes a lot less time, and the local rebase just works like lightning. Your mileage may vary.

Cheers
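The two-step catch-up Dean describes in the message above might look like the following sketch. It rests on several assumptions that the thread does not state: the working copy was set up with git-svn, the git mirror is the remote named `origin`, and git-svn's tracking ref is the one that remote updates. It also uses `git fetch` where the message says "pull", so that the rebase is left entirely to git-svn. The `run` and `catch_up` helpers are hypothetical names, and `DRY_RUN` is only there so the commands can be printed without touching a repository.

```shell
# run: print the command when DRY_RUN is set, otherwise execute it.
# Hypothetical helper so the sketch can be inspected without a real clone.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# catch_up: bring a git-svn working copy up to date via the git mirror.
catch_up() {
    # 1. Fetch new commits from the git mirror (assumed to be "origin").
    run git fetch origin || return 1
    # 2. Rebase local work onto what was just fetched. -l (--local) tells
    #    git-svn not to fetch from the SVN server.
    run git svn rebase -l
}
```

Running `DRY_RUN=1 catch_up` prints the two commands instead of executing them, which is a cheap way to check the sequence before trying it on a real checkout.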