Brian Candler
2021-Jul-13 13:05 UTC
[Libguestfs] supermin root: race condition with multiple drives
Hi,

I discovered an issue when using libguestfs with large numbers of attached disks. I submitted the details to github:

https://github.com/libguestfs/libguestfs/issues/69

... and then discovered that the mailing list is the right place, not github. Sorry about that!

The problem is: I have batches of 40 or 50 qcow2 images to write files to. It is very slow to start a separate libguestfs appliance for each one, so what I do is to start a single one with 40 or 50 disks attached, and then mount, upload and unmount each one in turn.

What I find is that sometimes the disks are attached in the wrong order, such that the kernel tries to use one of these qcow2 files as its root disk, instead of the supermin appliance image. This seems to happen more often when the system is under load, such as when running multiple libguestfs instances concurrently (I have 15 or 20 different versions of these batches of 40-50 disks to create, so to speed things up, I run them concurrently).

This is all "userland" stuff so ought to work fine under load, but the supermin kernel booting issue messes it up intermittently.

Anyway, the github ticket has full details, including standalone scripts which can reproduce the problem on my system. I'd be grateful if someone could take a look.

Many thanks,

Brian Candler.
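For readers following along, the single-appliance workflow described above might look roughly like the sketch below, written against the libguestfs Python bindings. It is not the script from the github ticket. The function name, the choice to mount the first partition of each disk (device + "1"), and the assumption that list_devices() returns the drives in the order they were added are all illustrative guesses, not details from the original message.

```python
def write_file_to_images(g, images, local_path, remote_path):
    """Attach all images to one appliance, then mount/upload/unmount each in turn.

    `g` is expected to behave like a guestfs.GuestFS handle. Passing it in
    keeps the sketch independent of how the handle was created.
    """
    # Attach every qcow2 image before launching, so only one appliance boots.
    for image in images:
        g.add_drive_opts(image, format="qcow2", readonly=False)
    g.launch()

    # ASSUMPTION: devices come back in the same order the drives were added.
    devices = g.list_devices()
    if len(devices) != len(images):
        raise RuntimeError(
            "expected %d devices, appliance reports %d" % (len(images), len(devices))
        )

    written = []
    for image, device in zip(images, devices):
        # ASSUMPTION: the target filesystem lives on the first partition.
        g.mount(device + "1", "/")
        try:
            g.upload(local_path, remote_path)
        finally:
            # Always unmount so the next image starts from a clean state.
            g.umount_all()
        written.append((image, device))

    g.shutdown()
    return written
```

With the real bindings installed, the handle would typically be created with `import guestfs` followed by `g = guestfs.GuestFS(python_return_dict=True)`, then passed in along with the list of image paths.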
Richard W.M. Jones
2021-Jul-15 09:30 UTC
[Libguestfs] supermin root: race condition with multiple drives
On Tue, Jul 13, 2021 at 02:05:48PM +0100, Brian Candler wrote:
> Hi,
>
> I discovered an issue when using libguestfs with large numbers of
> attached disks. I submitted the details to github:
>
> https://github.com/libguestfs/libguestfs/issues/69
>
> ... and then discovered that the mailing list is the right place,
> not github. Sorry about that!
>
> The problem is: I have batches of 40 or 50 qcow2 images to write
> files to. It is very slow to start a separate libguestfs appliance
> for each one, so what I do is to start a single one with 40 or 50
> disks attached, and then mount, upload and unmount each one in turn.
>
> What I find is that sometimes the disks are attached in the wrong
> order, such that the kernel tries to use one of these qcow2 files as
> its root disk, instead of the supermin appliance image. This seems
> to happen more often when the system is under load, such as when
> running multiple libguestfs instances concurrently (I have 15 or 20
> different versions of these batches of 40-50 disks to create, so to
> speed things up, I run them concurrently).

It could be this issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1804207

Fixed in libguestfs 1.42.0 (commit bca9b94fc59377 and others).

Rich.

> This is all "userland" stuff so ought to work fine under load, but
> the supermin kernel booting issue messes it up intermittently.
>
> Anyway, the github ticket has full details, including standalone
> scripts which can reproduce the problem on my system. I'd be
> grateful if someone could take a look.
>
> Many thanks,
>
> Brian Candler.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages.
http://libguestfs.org
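
One way to check whether an installed libguestfs is at least the 1.42.0 release mentioned in the reply is to compare the fields reported by the handle's version() call, which in the Python bindings returns a dict with "major", "minor", "release" and "extra" keys. The helper below is a hypothetical sketch and its name is made up for illustration. A distribution may backport the fix to an older version number, so the result is only a hint.

```python
def libguestfs_at_least(version, minimum=(1, 42, 0)):
    """Return True if a version()-style dict is at least `minimum`.

    `version` is expected to look like the dict returned by
    guestfs.GuestFS().version(), e.g.
    {"major": 1, "minor": 42, "release": 0, "extra": ""}.
    """
    found = (version["major"], version["minor"], version["release"])
    # Tuples compare element by element, which matches version ordering here.
    return found >= tuple(minimum)
```

With the real bindings, this would be called as `libguestfs_at_least(g.version())` on an existing handle `g`.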