Richard W.M. Jones
2019-Jun-10 15:35 UTC
Re: [Libguestfs] 1.39 proposal: Let's split up the libguestfs git repo and tarballs
Sorry for the late reply to this ...

On Tue, Apr 30, 2019 at 06:28:01PM +0200, Pino Toscano wrote:
> On Friday, 9 February 2018 19:01:53 CEST Richard W.M. Jones wrote:
> > My contention is that the libguestfs git repository is too large and
> > unwieldy. There are too many separate, unrelated projects and as a
> > result of that the source has too many dependencies and takes too long
> > to build and test.
> >
> > The project divides (sort of) naturally into layers -- the library,
> > the bindings, the various virt tools -- and could be split along those
> > lines into separate projects which can then be released and evolve at
> > their own pace.
>
> As also other answers to this email say, splitting tools, and bindings
> may be very complex, and thus for now it is still a too far goal.
>
> However...
>
> > My suggested split would be something like this:
> >
> > [...]
> > virt-v2v and virt-p2v
>
> I'd rather split virt-p2v in its own repository. There are various
> reasons for this:
> - it does not use libguestfs (the library), just the tools for testing
>   stuff
> - the communication with virt-v2v is done via network, and its
>   capabilities are dynamically probed (so theoretically virt-p2v, and
>   virt-v2v can be used even when their versions are odd)
> - it is written only in C
>
> However, even if it looks simple, in reality there are a number of common
> things used from the rest of the libguestfs tree:
> 1) gnulib

We hardly use gnulib in virt-p2v. I think it's only used for
ignore-value.h, getprogname.h, and c-ctype.h, all of which are likely
to be easily worked around.

> 2) some build system bits (e.g. m4/guestfs-v2v.m4)

Right, although this in itself should be split up, so no bad thing.

> 3) auto-cleanup bits (e.g. CLEANUP_FREE), although only few are used
>    (CLEANUP_FREE, CLEANUP_FREE_STRING_LIST, CLEANUP_PCLOSE,
>    CLEANUP_FCLOSE, and CLEANUP_XMLFREETEXTWRITER)
> 4) other internal macros, i.e. guestfs-utils.h

Common code is a bit trickier, as is ...

> 5) the list of credits generated by the generator
>    (i.e. generator/authors.ml)
> 6) the p2v configuration generated by the generator
>    (i.e. generator/p2v_config.ml)

... the generator and ...

> 7) test images/data (phony images, and virt-tools)

test data.

> 8) the miniexpect module, right now out of the p2v subdirectory

This is only used by virt-p2v I think, so it could go with virt-p2v or
be made into a separate project.

> Possible solutions may/might be:
> 1) add own submodule (use its own set of modules)

I think we should ditch gnulib as much as possible, so see above.

> 2) copy/implement them locally: luckily they are not many, so
>    inlining them in configure.ac will not be a problem; the common
>    bits (e.g. the distro detection from os-release) can be split in
>    its own module in libguestfs, copying it in p2v
> 3/4) have a local version of them; not pretty, although they are not
>    that many
> 5) this list is reflected in two places: the p2v/about-authors.c file,
>    and the AUTHORS file (theoretically mandatory for automake, unless
>    "foreign" is used, which it is); my idea was to go back to a manually
>    written about-authors.c file without the libguestfs credits, leaving
>    the few p2v ones easy to manage; the same for the AUTHORS file
> 6) this is a bit more complex: my idea was to keep it as OCaml script
>    to run at build time, instead of being statically shipped at dist
>    time
> 7) create their own versions at test time using guestfish/virt-builder;
>    maybe use a fedora image, instead of a phony windows one (will avoid
>    hivex for the tests)
> 8)

So while I'm not a massive fan of git submodules, now that I have used
them a few times with riscv stuff, they do solve a certain problem as
long as they are managed carefully. I think the common code and the
generator are cases where a submodule or two would work.

Does this mean we need to move immediately to a submodule if just
splitting virt-p2v, or copy code as you suggest? Maybe not, because
you can imagine for just this project copying the code needed from the
common/ directory, and creating a new "mini-generator" for the project
which handles the little bits that need to be generated in virt-p2v.
However in the long term if we split up everything a submodule or two
does seem to make sense, so maybe we should start there?

> The other problem is how to split the repository, as the various bits
> are in different places:
> a) git filter-branch --subdirectory-filter p2v
>    + very small repo with the current p2v subdirectory
>    + preserves the history of the p2v subdirectory, with branches and tags
>    - missing all the other bits, which will have no history
>    - not usable to build older releases (e.g. for bisecting)

I'm not exactly sure what this does. Is this something to do with
preserving the history? TBH I don't think we need to bother with the
history -- it exists still in libguestfs.git.

> b) create a work branch in libguestfs, then in that branch move/copy all
>    the stuff making the p2v subdirectory build standalone there, and then
>    import the content of the p2v subdirectory of that branch in a new empty
>    repo
>    + very small repo with the current p2v subdirectory
>    - no history, no tags nor branches
>    + using a graft it is possible to "stitch" the history of the new repo
>      with the work branch in libguestfs
>
> c) git filter-branch to remove all the bits not related to p2v from all
>    the commits
>    + not that big repo
>    + preserves the history of all the content, with branches and tags
>    - will take a very long time to create (e.g. iterate over and over to
>      find out what to remove)
>    - not usable to build older releases (e.g. for bisecting)

Rich.
-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/
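Option (b) above mentions stitching the history of a freshly imported repo onto the old work branch with a graft. The following is only a rough sketch of how that might look using `git replace --graft`, which is one way to express a graft in current git; whether this is the mechanism that was intended is an assumption. The repositories `old` and `new` and the file `p2v.c` are made-up stand-ins, not the real libguestfs layout.

```shell
set -e
# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# "old" is a hypothetical stand-in for libguestfs.git with its work branch.
git init -q old
(
  cd old
  echo v1 > p2v.c; git add .; git commit -q -m "old: first commit"
  echo v2 > p2v.c; git commit -q -am "old: make p2v build standalone"
)

# "new" is a hypothetical stand-in for the new, empty p2v repository:
# a single root commit that imports the content, with no history.
git init -q new
cd new
cp ../old/p2v.c .
git add .; git commit -q -m "Import p2v from the work branch"
before=$(git rev-list --count HEAD)

# Fetch the old objects (no merge) and remember the tip of the old history.
git fetch -q ../old HEAD
old_tip=$(git rev-parse FETCH_HEAD)

# Record a replacement so the import commit appears to have the old tip
# as its parent. The original commit object is left untouched.
root=$(git rev-list --max-parents=0 HEAD)
git replace --graft "$root" "$old_tip"

after=$(git rev-list --count HEAD)
echo "commits visible before graft: $before, after graft: $after"
```

The replacement lives under `refs/replace/` in the local repository only, so anyone else who wants the stitched view has to fetch those refs (and the old objects) explicitly; a plain clone of the new repo still sees just the import commit.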
Daniel P. Berrangé
2019-Jun-10 15:51 UTC
Re: [Libguestfs] 1.39 proposal: Let's split up the libguestfs git repo and tarballs
On Mon, Jun 10, 2019 at 04:35:52PM +0100, Richard W.M. Jones wrote:
> Sorry for the late reply to this ...
>
> On Tue, Apr 30, 2019 at 06:28:01PM +0200, Pino Toscano wrote:
> > The other problem is how to split the repository, as the various bits
> > are in different places:
> > a) git filter-branch --subdirectory-filter p2v
> >    + very small repo with the current p2v subdirectory
> >    + preserves the history of the p2v subdirectory, with branches and tags
> >    - missing all the other bits, which will have no history
> >    - not usable to build older releases (e.g. for bisecting)
>
> I'm not exactly sure what this does. Is this something to do with
> preserving the history? TBH I don't think we need to bother with the
> history -- it exists still in libguestfs.git.

This is the approach we took in libvirt when splitting the python
module out to its own repo. We didn't bother to make older versions
build. Specifically the commands we ran were[1]:

  $ git clone libvirt libvirt-python
  $ cd libvirt-python
  $ git filter-branch --subdirectory-filter python --tag-name-filter cat -- --all
  $ git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
  $ git reflog expire --expire=now --all
  $ git gc --prune=now

I can't remember precisely what each of those steps does, but they are
all about dropping as much cruft as possible from the new git repo.

[1] https://www.redhat.com/archives/libvir-list/2013-September/msg00413.html

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
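Since the message above leaves open what each step does, here is a hedged, annotated walk-through of the same command sequence on a tiny throwaway repository. The repositories `src` and `split` and their files are invented for the demo; the comments reflect the documented behaviour of each git command, not anything specific to libvirt or libguestfs.

```shell
set -e
# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Newer git prints a notice and pauses before filter-branch; skip that.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Hypothetical source repo: two commits touch python/, one does not.
git init -q src
(
  cd src
  mkdir python
  echo one > python/a.py; git add .; git commit -q -m "python: add a.py"
  echo lib > lib.c;       git add .; git commit -q -m "lib: add lib.c"
  echo two > python/b.py; git add .; git commit -q -m "python: add b.py"
  git tag v1.0
)

# 1. Work on a clone so the original repository is left alone.
git clone -q src split
cd split

# 2. Rewrite all refs so that python/ becomes the top-level directory.
#    Only history touching python/ is kept; "--tag-name-filter cat"
#    keeps tag names and moves them onto the rewritten commits.
git filter-branch --subdirectory-filter python --tag-name-filter cat -- --all >/dev/null 2>&1

# 3. filter-branch saves the pre-rewrite refs under refs/original/;
#    delete those backups so nothing points at the old history.
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d

# 4. Expire reflog entries, which would otherwise keep old commits alive.
git reflog expire --expire=now --all

# 5. Garbage-collect the objects that are now unreachable.
git gc -q --prune=now

commits=$(git rev-list --count HEAD)
files=$(git ls-files | sort | tr '\n' ' ')
echo "commits=$commits files=$files"
```

In this toy case the commit that only touched `lib.c` disappears from the new repository and the contents of `python/` end up at the top level, which matches the "+ very small repo" and "- missing all the other bits" points in option (a).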
Pino Toscano
2019-Jul-01 16:10 UTC
[Libguestfs] 1.39 proposal: Let's split up the libguestfs git repo and tarballs
On Monday, 10 June 2019 17:35:52 CEST Richard W.M. Jones wrote:
> Sorry for the late reply to this ...
>
> On Tue, Apr 30, 2019 at 06:28:01PM +0200, Pino Toscano wrote:
> > On Friday, 9 February 2018 19:01:53 CEST Richard W.M. Jones wrote:
> > > My contention is that the libguestfs git repository is too large and
> > > unwieldy. There are too many separate, unrelated projects and as a
> > > result of that the source has too many dependencies and takes too long
> > > to build and test.
> > >
> > > The project divides (sort of) naturally into layers -- the library,
> > > the bindings, the various virt tools -- and could be split along those
> > > lines into separate projects which can then be released and evolve at
> > > their own pace.
> >
> > As also other answers to this email say, splitting tools, and bindings
> > may be very complex, and thus for now it is still a too far goal.
> >
> > However...
> >
> > > My suggested split would be something like this:
> > >
> > > [...]
> > > virt-v2v and virt-p2v
> >
> > I'd rather split virt-p2v in its own repository. There are various
> > reasons for this:
> > - it does not use libguestfs (the library), just the tools for testing
> >   stuff
> > - the communication with virt-v2v is done via network, and its
> >   capabilities are dynamically probed (so theoretically virt-p2v, and
> >   virt-v2v can be used even when their versions are odd)
> > - it is written only in C
> >
> > However, even if it looks simple, in reality there are a number of common
> > things used from the rest of the libguestfs tree:
> > 1) gnulib
>
> We hardly use gnulib in virt-p2v. I think it's only used for
> ignore-value.h, getprogname.h, and c-ctype.h, all of which are likely
> to be easily worked around.

True, however for now it can stay, as it is one obstacle less for the
split.

> > 3) auto-cleanup bits (e.g. CLEANUP_FREE), although only few are used
> >    (CLEANUP_FREE, CLEANUP_FREE_STRING_LIST, CLEANUP_PCLOSE,
> >    CLEANUP_FCLOSE, and CLEANUP_XMLFREETEXTWRITER)
> > 4) other internal macros, i.e. guestfs-utils.h
>
> Common code is a bit trickier, as is ...

So far it is ~4K of bits of code copied, with ~9K more of straight
copies of libxml2-cleanups.c + libxml2-writer-macros.h from
common/utils.

> > 5) the list of credits generated by the generator
> >    (i.e. generator/authors.ml)
> > 6) the p2v configuration generated by the generator
> >    (i.e. generator/p2v_config.ml)
>
> ... the generator and ...

(5) is more shared with the rest, while (6) is basically p2v-only
material.

> > 7) test images/data (phony images, and virt-tools)
>
> test data.

Luckily this is easy to recreate locally.

> > 8) the miniexpect module, right now out of the p2v subdirectory
>
> This is only used by virt-p2v I think, so it could go with virt-p2v or
> be made into a separate project.

Right, the upstream is somewhere else, so another "import from $URL"
commit will not be any worse than what we have now.

> > Possible solutions may/might be:
> > 1) add own submodule (use its own set of modules)
>
> I think we should ditch gnulib as much as possible, so see above.

Surely we can work on removing it after the split, step by step, if
needed/wanted.

> So while I'm not a massive fan of git submodules, now that I have used
> them a few times with riscv stuff, they do solve a certain problem as
> long as they are managed carefully. I think the common code and the
> generator are cases where a submodule or two would work.

TBH I've always found submodules tricky and problematic to use:
- they are fixed to a certain revision (so no way to dynamically follow
  the branch of another repo)
- the URL is the same for all the users, meaning you cannot reuse the
  same authenticated/secure protocols that your repo has
- they create a certain burden when switching to a tag/branch/commit
  whose revision of a submodule is different than what is at the current
  branch
- even more problematic when switching commit, and in the old commit
  a subdirectory is a real directory while in the latest HEAD is a
  submodule (or vice versa)

> Does this mean we need to move immediately to a submodule if just
> splitting virt-p2v, or copy code as you suggest? Maybe not, because
> you can imagine for just this project copying the code needed from the
> common/ directory, and creating a new "mini-generator" for the project
> which handles the little bits that need to be generated in virt-p2v.

I'm actually solving in a different way, i.e. avoiding altogether the
generator for p2v stuff.

> However in the long term if we split up everything a submodule or two
> does seem to make sense, so maybe we should start there?

ATM I have enough work needed just to split p2v, so I'd prefer to delay
this conversation to a later time...

> > The other problem is how to split the repository, as the various bits
> > are in different places:
> > a) git filter-branch --subdirectory-filter p2v
> >    + very small repo with the current p2v subdirectory
> >    + preserves the history of the p2v subdirectory, with branches and tags
> >    - missing all the other bits, which will have no history
> >    - not usable to build older releases (e.g. for bisecting)
>
> I'm not exactly sure what this does. Is this something to do with
> preserving the history? TBH I don't think we need to bother with the
> history -- it exists still in libguestfs.git.

Yes, this is for preserving history, at least for the most important
parts (the sources of p2v).

-- 
Pino Toscano
Richard W.M. Jones
2019-Jul-01 20:47 UTC
Re: [Libguestfs] 1.39 proposal: Let's split up the libguestfs git repo and tarballs
On Mon, Jul 01, 2019 at 06:10:48PM +0200, Pino Toscano wrote:
> On Monday, 10 June 2019 17:35:52 CEST Richard W.M. Jones wrote:
> > So while I'm not a massive fan of git submodules, now that I have used
> > them a few times with riscv stuff, they do solve a certain problem as
> > long as they are managed carefully. I think the common code and the
> > generator are cases where a submodule or two would work.
>
> TBH I've always found submodules tricky and problematic to use:
> - they are fixed to a certain revision (so no way to dynamically follow
>   the branch of another repo)
> - the URL is the same for all the users, meaning you cannot reuse the
>   same authenticated/secure protocols that your repo has
> - they create a certain burden when switching to a tag/branch/commit
>   whose revision of a submodule is different than what is at the current
>   branch
> - even more problematic when switching commit, and in the old commit
>   a subdirectory is a real directory while in the latest HEAD is a
>   submodule (or vice versa)

I mean, I don't disagree with any of this :-) For riscv we pinned
Linux kernel and various toolchains at precise commits, and then only
moved those forwards as we tested new combinations.

Anyhow, whatever works.

> > Does this mean we need to move immediately to a submodule if just
> > splitting virt-p2v, or copy code as you suggest? Maybe not, because
> > you can imagine for just this project copying the code needed from the
> > common/ directory, and creating a new "mini-generator" for the project
> > which handles the little bits that need to be generated in virt-p2v.
>
> I'm actually solving in a different way, i.e. avoiding altogether the
> generator for p2v stuff.

Hmm. There are parts of the current generator that apply to virt-p2v.
Can we split those parts of the generator out to have a new generator
that only applies to p2v? I find the generated config stuff useful,
and in fact have a non-upstream patch to enhance it some more.

Rich.
-- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-top is 'top' for virtual machines. Tiny program with many powerful monitoring features, net stats, disk stats, logging, etc. http://people.redhat.com/~rjones/virt-top
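The "fixed to a certain revision" point, and the practice of pinning and then deliberately moving a submodule forward, can be illustrated with a small throwaway setup. This is only a sketch: `shared` and `p2v` are hypothetical repository names, `common` is a hypothetical submodule path, and the `protocol.file.allow` override is needed only because the demo clones the submodule from a local path.

```shell
set -e
# Throwaway identity so commits work in a clean environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# "shared" is a hypothetical stand-in for a repository of common code.
git init -q shared
(cd shared; echo a > utils.c; git add .; git commit -q -m "shared: initial")
pinned=$(git -C shared rev-parse HEAD)

# "p2v" is a hypothetical stand-in for the project consuming it.
git init -q p2v
cd p2v
git -c protocol.file.allow=always submodule --quiet add "$work/shared" common
git commit -q -m "Add shared code as a submodule at common/"

# The shared repository moves on ...
(cd "$work/shared"; echo b >> utils.c; git commit -q -am "shared: second")
upstream=$(git -C "$work/shared" rev-parse HEAD)

# ... but the superproject still records the commit it was added at.
recorded=$(git rev-parse HEAD:common)

# Moving forward is an explicit, reviewable step in the superproject.
git -C common fetch -q origin
git -C common checkout -q "$upstream"
git add common
git commit -q -m "Bump common to the tested revision"
bumped=$(git rev-parse HEAD:common)

echo "pinned=$pinned recorded=$recorded bumped=$bumped"
```

The superproject stores one commit ID per submodule, so following another repo's branch automatically is not what happens by default; each bump is its own commit, which is the trade-off both sides of the discussion describe.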