Tom Stellard via llvm-dev
2021-Jan-08 01:05 UTC
[llvm-dev] [RFC] Move py-mlir-release to new top-level repo in the LLVM org
On 1/7/21 3:17 PM, Stella Laurenzo wrote:
> On Thu, Jan 7, 2021 at 2:40 PM Tom Stellard <tstellar at redhat.com> wrote:
>
> > On 1/7/21 10:55 AM, Stella Laurenzo via llvm-dev wrote:
> > > Hi folks, I would like to propose that we create a new top-level
> > > repo in the LLVM organization for organizing the Python MLIR
> > > Releases (both daily and official numbered releases, whenever we
> > > are ready for such a thing) and corresponding pushes to package
> > > repositories, etc.
> >
> > For those of us that are unfamiliar, can you explain what the
> > "Python MLIR Releases" are?
>
> Sure: They are the python wheels and source distributions for the
> [MLIR Python Bindings](https://mlir.llvm.org/docs/Bindings/Python/).
> The key is that we do them in concordance with how Python packages get
> released and push them through standard channels for deployment, and
> this involves some gymnastics (of which, what I have will grow in some
> complexity as we do this, based on the experience of other projects).
> They basically include everything such that if you do a "pip install
> mlir" you get a working package that is able to build and compile
> MLIR-based IR in a variety of forms. An ancillary function of them is
> to enable downstream Python-based projects to extend the system, so it
> entails distributing enough headers and libraries to make this
> feasible.

Ok, so it's this python code: llvm-project/mlir/lib/Bindings/Python ?

> > > I have prototyped such a release process in a personal repo:
> > > https://github.com/stellaraccident/mlir-py-release
> > >
> > > Additional development on that release process is currently
> > > blocked on more work on the shared library organization in LLVM
> > > (discussed here
> > > https://lists.llvm.org/pipermail/llvm-dev/2021-January/147567.html
> > > and being worked on independently) but it is useful as is and a
> > > reasonable starting point for further work.
> > >
> > > I would propose that we just fork my current repo into a new one
> > > in the LLVM organization and then take the necessary steps to get
> > > credentials/permissions/secrets set up in the new context.
> > >
> > > Some answers to questions that may come up:
> > >
> > >   * *Why should this be a repo separate from llvm-project?* These
> > >     kinds of automation repos tend to have a lot of "garbage"
> > >     commits that I think are best kept from polluting the main repo
> > >     (and also don't face contention on automatic jobs bumping
> > >     things, etc). They also tend to require special permissions and
> > >     secrets that we will want to more tightly control. They also
> > >     make use of other GitHub features that it seems we would prefer
> > >     not to have polluting the main development flow ("Releases"
> > >     tab, Actions, etc). Also, this is the kind of thing that tends
> > >     to get revised en masse periodically, and again, it would be
> > >     good to not pollute the monorepo.
> >
> > There really aren't many files in this repo, do you anticipate it
> > growing significantly?
>
> Not terribly so. Just from some personal experience, the ways things
> are done for Python packaging are somewhat... esoteric... from a normal
> C++ build flow and necessitate certain directory layouts and such that
> I felt were better left to their own thing (it is something that you
> want to do exactly as everyone else does it).
>
> > >   * *Why not land this in llvm-zorg?* llvm-zorg claims to be for
> > >     "LLVM Testing Infrastructure" and seems well scoped to that
> > >     statement. What I am managing above is periodic, automated
> > >     release tooling based on open-source CI systems (currently
> > >     GitHub Actions), which are fairly standardized across the
> > >     Python releasing community, easy to set up, etc.
> >
> > llvm-zorg also handles generating the websites. My personal opinion
> > is that it would be OK to try to do this in llvm-zorg, but you're
> > probably better off asking Galina about that.
> > I guess the downside of using llvm-zorg is you don't get the
> > releases tab.
>
> That is a good reason to put it there. One of the actions that is not
> implemented yet is for generating API docs (which is done post
> build/install for the Python side, because it introspects a running
> system).
>
> The releases page is actually pretty important. For snapshot builds,
> python's pip can just scrape it directly for published, installable
> artifacts and without it, we would need to roll our own place to stash
> such things.

Could you have the GitHub action directly submit the package to pip
rather than having it scrape the release page? If we could, would there
be any reason to have a release page? Would users be downloading from
the release page or from pip?

> > Why did you choose to write the checkout_repo.py script in python
> > rather than using the GitHub checkout action, or writing your own
> > custom action?
>
> Good question - that was a limitation in my knowledge at the time (need
> to source the version from a file). Consider that a TODO to eliminate.

If you need anything more complicated than some of the builtin actions,
you can add them to the llvm/actions repo.

-Tom

> > >   * *What ultimately will the code in this repo do?*
> > >       o Have periodic GitHub actions to select new LLVM revisions
> > >         and schedule daily/snapshot releases.
> >
> > Do you have any idea of how much of the GitHub Actions resources
> > this would use? e.g. how many hours per week per Operating System?
>
> Currently, each snapshot builds for about 30m on the free 2-core setups
> per OS. However, this isn't presently compiling as much of LLVM as will
> ultimately be needed. I have automation for another project where we do
> build more/most of the backends as well, and that builds for 1.25-1.5
> hours per snapshot (and builds a fair bit more things unrelated to
> LLVM, so just an upper bound estimate).
> On my other project, I found that each minor python version added (of
> which, there are probably ~4 LTS at any given time) added about 1min to
> each build.
>
> So if we are doing 2 snapshots a day and being conservative, 28
> hours/week/OS?
>
> I'm not running tests yet, so that will come with some costs. We will
> probably choose to run just the python bindings tests per python
> version (which are really cheap) and then run the full regression suite
> once per OS.
>
> > >       o Have manual actions for triggering official, numbered
> > >         releases.
> > >       o Provide facilities for building Python wheels for PyPI and
> > >         house any additional metadata/automation needed for
> > >         Anaconda.
> > >       o Build releases for all supported operating systems
> > >         (currently Linux/CentOS7/manylinux2014, macOS, and Windows)
> > >         and supported Python versions (currently 3.6, 3.7, 3.8,
> > >         3.9).
> > >       o Publish release artifacts on the Releases tab for
> > >         daily/snapshot releases.
> > >       o Provide a stable reference point for downstream projects
> > >         that extend MLIR-Python and need to produce version-matched
> > >         artifacts of their own.
> > >   * *Could this graduate to be more than "MLIR" python?* Maybe. I
> > >     chose the name because that is what I am focused on and didn't
> > >     want to grab too much land. But there is nothing stopping this
> > >     from becoming automation for general LLVM monorepo+incubator
> > >     Python releasing.
> >
> > I think it would be great to generalize this. I would also like to
> > automate parts of the main LLVM release, and there seems to be some
> > overlap with what you are doing.
>
> Agreed. I actually found this quite easy to prototype. I think I spent
> a grand total of ~a day on what is there (which isn't done yet, but
> isn't super far off). It then took me ~3 days to adapt it to IREE
> (https://github.com/google/iree), which is much more complicated (as it
> has to build LLVM, a bunch of deps and TensorFlow).
> >
> > -Tom
> >
> > >   * *What if we don't do this?*
> > >       o *Option A:* We keep running this in a private repo with the
> > >         disclaimer that is currently at the top: "Note that this is
> > >         a prototype of a real MLIR release process being run by a
> > >         member of the community. These are not official releases of
> > >         the LLVM Foundation in any way, and they are likely only
> > >         going to be useful to people actually working on LLVM/MLIR
> > >         until we get things productionized." We would miss
> > >         opportunities for convergence with other projects and would
> > >         cause things to fragment.
> > >       o *Option B:* We only publish Python bindings in official
> > >         LLVM release packages, and only for the Python version they
> > >         are built with. We don't release Python binaries through
> > >         normal package management channels.
> > >
> > > Opinions?
> > > - Stella
> > >
> > > _______________________________________________
> > > LLVM Developers mailing list
> > > llvm-dev at lists.llvm.org
> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
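The "28 hours/week/OS" figure quoted above can be reproduced with a little arithmetic. The sketch below is only an illustration: the operating systems, Python versions, and two snapshots per day come from the thread, while the 2 hours per snapshot is an assumption (a rounding up of the quoted 1.25-1.5 hour upper bound) chosen because it yields the quoted total. Treating each OS/Python pair as one build configuration is likewise an assumption.

```python
from itertools import product

# Values quoted in the thread.
OPERATING_SYSTEMS = ["Linux (manylinux2014)", "macOS", "Windows"]
PYTHON_VERSIONS = ["3.6", "3.7", "3.8", "3.9"]
SNAPSHOTS_PER_DAY = 2

# Assumption: 2 hours per snapshot, a conservative rounding up of the
# 1.25-1.5 hour upper bound mentioned in the thread.
HOURS_PER_SNAPSHOT = 2.0
DAYS_PER_WEEK = 7


def build_matrix(systems, versions):
    """Return every (operating system, Python version) pair to build."""
    return list(product(systems, versions))


def weekly_hours_per_os(snapshots_per_day, hours_per_snapshot):
    """Estimate CI hours consumed per week on one operating system."""
    return snapshots_per_day * DAYS_PER_WEEK * hours_per_snapshot


matrix = build_matrix(OPERATING_SYSTEMS, PYTHON_VERSIONS)
per_os = weekly_hours_per_os(SNAPSHOTS_PER_DAY, HOURS_PER_SNAPSHOT)
total = per_os * len(OPERATING_SYSTEMS)

print(f"{len(matrix)} build configurations per snapshot")
print(f"{per_os:.0f} hours/week/OS, {total:.0f} hours/week in total")
```

Changing HOURS_PER_SNAPSHOT to 0.5 (the current 30-minute build) gives a lower figure of 7 hours/week/OS, which shows how sensitive the budget is to how much of LLVM ends up being compiled.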
Stella Laurenzo via llvm-dev
2021-Jan-08 02:08 UTC
[llvm-dev] [RFC] Move py-mlir-release to new top-level repo in the LLVM org
On Thu, Jan 7, 2021 at 5:05 PM Tom Stellard <tstellar at redhat.com> wrote:

> > The releases page is actually pretty important. For snapshot builds,
> > python's pip can just scrape it directly for published, installable
> > artifacts and without it, we would need to roll our own place to
> > stash such things.
>
> Could you have the GitHub action directly submit the package to pip
> rather than having it scrape the release page? If we could, would there
> be any reason to have a release page? Would users be downloading from
> the release page or from pip?

My team's preference, while we are as pre-release as we are, is to not
pollute the pip namespace until we're sure we have what we want.
Deploying to the local project's release page is a good way to let some
people use it earlier while still keeping an appropriate barrier to
entry that matches where it's at in the life cycle. Personal preference.
Some projects end up always deploying from their release page because
they can't comply with PyPI policies (usually around distro version,
dependencies, etc), but I've charted this out and think we will stay
compliant.

> If you need anything more complicated than some of the builtin actions,
> you can add them to the llvm/actions repo.

Nice, thanks.
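To make the "pip can scrape the release page" idea concrete, here is a minimal sketch of how a user might be pointed at snapshot builds without anything being uploaded to PyPI. It only builds and prints the command; nothing is installed. pip's `--find-links` and `--pre` options are real, and the package name "mlir" is the one used in the thread, but the Releases URL is an assumption derived from the prototype repo's address, and the sketch assumes that page exposes direct links to wheel files.

```python
import shlex
import sys

# Assumption: the Releases page of the prototype repo named in the thread.
RELEASES_URL = "https://github.com/stellaraccident/mlir-py-release/releases"


def pip_install_command(package, find_links_url):
    """Build a pip invocation that resolves `package` from a links page."""
    return [
        sys.executable, "-m", "pip", "install",
        "--pre",                         # also consider pre-release versions
        "--find-links", find_links_url,  # scan this page for archive links
        package,
    ]


command = pip_install_command("mlir", RELEASES_URL)
print(shlex.join(command))
```

Keeping snapshots behind an explicit `--find-links` URL matches the "barrier to entry" described above: a plain `pip install mlir` would not pick them up unless the package were also published to PyPI.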