Hi Eric,

I wonder if you have any thoughts about this build failure in
tests/test-nozero.sh?

  https://koji.fedoraproject.org/koji/taskinfo?taskID=48259627
  log: https://kojipkgs.fedoraproject.org//work/tasks/9762/48259762/build.log

The error is “nozero6.img was trimmed by mistake”.  I added “set -x”
to the script earlier today so we can see exactly what's wrong, and it
is that:

  ++ stat -c %b nozero2.img
  ++ stat -c %b nozero6.img
  + test 4096 '!=' 2048
  + echo 'nozero6.img was trimmed by mistake'

AFAICT what this means is that nozero2.img is growing during the test
(from 2048 to 4096 blocks).  When I run the test locally this file
stays at 2048 blocks the whole time, and the test does not fail.

One other unfortunate problem is that Fedora is having lots of
toolchain problems right now (see Fedora devel list passim) so we
cannot really be sure that *any* other tool we are using has been
built correctly :-(  I've already disabled LTO in qemu and libguestfs,
but possibly there are other toolchain bugs.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW
On Fri, Jul 31, 2020 at 01:07:16PM +0100, Richard W.M. Jones wrote:
>Hi Eric,
>
>I wonder if you have any thoughts about this build failure in
>tests/test-nozero.sh?
>
>  https://koji.fedoraproject.org/koji/taskinfo?taskID=48259627
>  log: https://kojipkgs.fedoraproject.org//work/tasks/9762/48259762/build.log
>
>The error is “nozero6.img was trimmed by mistake”.  I added “set -x”
>to the script earlier today so we can see exactly what's wrong, and it
>is that:
>
>  ++ stat -c %b nozero2.img
>  ++ stat -c %b nozero6.img
>  + test 4096 '!=' 2048
>  + echo 'nozero6.img was trimmed by mistake'
>
>AFAICT what this means is that nozero2.img is growing during the test
>(from 2048 to 4096 blocks).  When I run the test locally this file
>stays at 2048 blocks the whole time, and the test does not fail.
>

Correct me if I am wrong, but that means that ALL images from 2 up to
5 grew from 2048 to 4096 and only image #6 was kept at the original
size, if I am reading the code correctly.  It still does not help me
to understand it, but it might help you.

The only other thing would be to check the size instead of the number
of blocks.  But I would be surprised if the block size changed.

>One other unfortunate problem is that Fedora is having lots of
>toolchain problems right now (see Fedora devel list passim) so we
>cannot really be sure that *any* other tool we are using has been
>built correctly :-(  I've already disabled LTO in qemu and libguestfs,
>but possibly there are other toolchain bugs.
>
>Rich.
>
>-- 
>Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
>Read my programming and virtualization blog: http://rwmj.wordpress.com
>Fedora Windows cross-compiler. Compile Windows programs, test, and
>build Windows installers. Over 100 libraries supported.
>http://fedoraproject.org/wiki/MinGW
>
>_______________________________________________
>Libguestfs mailing list
>Libguestfs@redhat.com
>https://www.redhat.com/mailman/listinfo/libguestfs
On 7/31/20 7:07 AM, Richard W.M. Jones wrote:
> Hi Eric,
>
> I wonder if you have any thoughts about this build failure in
> tests/test-nozero.sh?
>
>   https://koji.fedoraproject.org/koji/taskinfo?taskID=48259627
>   log: https://kojipkgs.fedoraproject.org//work/tasks/9762/48259762/build.log
>
> The error is “nozero6.img was trimmed by mistake”.  I added “set -x”
> to the script earlier today so we can see exactly what's wrong, and it
> is that:
>
>   ++ stat -c %b nozero2.img
>   ++ stat -c %b nozero6.img
>   + test 4096 '!=' 2048
>   + echo 'nozero6.img was trimmed by mistake'

Hmm, maybe it is file-system dependent (not all filesystems reserve the
same amount of space for a sparse file - that's something that qemu
iotests keep on hitting).

>
> AFAICT what this means is that nozero2.img is growing during the test
> (from 2048 to 4096 blocks).  When I run the test locally this file
> stays at 2048 blocks the whole time, and the test does not fail.

Growing a small amount but still being sparse is different from growing
a huge amount to be non-sparse altogether.  I'll have to double-check
what the test is actually doing (the size of the files involved) and see
if we can relax the test into allowing a range of sizes that show a file
is still reasonably sparse.  But knowing what filesystem koji is using
may matter (for example, if this is something that shows up on btrfs but
not ext4, that would explain why koji fails when I pass locally...)

>
> One other unfortunate problem is that Fedora is having lots of
> toolchain problems right now (see Fedora devel list passim) so we
> cannot really be sure that *any* other tool we are using has been
> built correctly :-(  I've already disabled LTO in qemu and libguestfs,
> but possibly there are other toolchain bugs.
>
> Rich.
>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
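The relaxed check Eric mentions could, for instance, compare the space
allocated on disk against the file's apparent size instead of requiring
two block counts to be identical.  What follows is only a sketch: the
function names and the 50% default threshold are hypothetical, nothing
here is taken from tests/test-nozero.sh, and it assumes GNU stat.
Assuming the image holds 1 MiB of data, 2048 blocks of 512 bytes is
100% allocated and 4096 blocks is 200%; both are clearly not trimmed,
whereas a trimmed file would sit near 0%.

```shell
# Hypothetical helpers, not part of nbdkit's test suite.

# percent_allocated SIZE BLOCKS BLOCKSIZE
# Print allocated space as an integer percentage of the apparent size.
percent_allocated() {
    if [ "${1:-0}" -gt 0 ]; then
        echo $(( $2 * $3 * 100 / $1 ))
    else
        echo 0      # empty file: avoid dividing by zero
    fi
}

# file_percent_allocated FILE
# Same calculation on a real file using GNU stat:
#   %s = apparent size in bytes, %b = allocated blocks,
#   %B = size in bytes of each block counted by %b.
file_percent_allocated() {
    out=$(stat -c '%s %b %B' "$1") || return 1
    set -- $out
    percent_allocated "$1" "$2" "$3"
}

# looks_untrimmed FILE [MIN_PERCENT]
# Succeed if at least MIN_PERCENT of the apparent size is allocated.
# The default of 50 is an arbitrary choice for illustration.
looks_untrimmed() {
    pct=$(file_percent_allocated "$1") || return 1
    [ "$pct" -ge "${2:-50}" ]
}
```

A test could then call something like "looks_untrimmed nozero6.img"
rather than comparing against nozero2.img, which tolerates a
filesystem that allocates more than the data size.  Whether a
percentage threshold is the right relaxation for this test is an open
question.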
On Fri, Jul 31, 2020 at 09:11:51AM -0500, Eric Blake wrote:
> Growing a small amount but still being sparse is different from
> growing a huge amount to be non-sparse altogether.  I'll have to
> double-check what the test is actually doing (the size of the files
> involved) and see if we can relax the test into allowing a range of
> sizes that show a file is still reasonably sparse.  But knowing what
> filesystem koji is using may matter (for example, if this is
> something that shows up on btrfs but not ext4, that would explain
> why koji fails when I pass locally...)

So about btrfs: While Fedora is planning to use btrfs for desktop
installs in future, that hasn't happened so far, isn't planned for the
server editions, and I doubt will ever affect Koji because that uses
RHEL.

Nevertheless I thought it'd be interesting to try it because I don't
think I've ever used btrfs for this.

  $ nbdkit memory 4G allocator=zstd
  # modprobe nbd
  # nbd-client -b 512 localhost /dev/nbd0
  # mkfs.btrfs /dev/nbd0
  # mount /dev/nbd0 /tmp/mnt
  # chown rjones.users /tmp/mnt

I built nbdkit from git on this filesystem and ran the tests and it
was all fine.  After the tests:

  $ stat -f /tmp/mnt
    File: "/tmp/mnt"
      ID: 842a95913d293347 Namelen: 255     Type: btrfs
  Block size: 4096       Fundamental block size: 4096
  Blocks: Total: 1048576    Free: 801634     Available: 736710
  Inodes: Total: 0          Free: 0
  $ df -h /tmp/mnt
  Filesystem      Size  Used Avail Use% Mounted on
  /dev/nbd0       4.0G  968M  2.9G  26% /tmp/mnt

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
On Fri, Jul 31, 2020 at 5:12 PM Eric Blake <eblake@redhat.com> wrote:
>
> On 7/31/20 7:07 AM, Richard W.M. Jones wrote:
> > Hi Eric,
> >
> > I wonder if you have any thoughts about this build failure in
> > tests/test-nozero.sh?
> >
> >   https://koji.fedoraproject.org/koji/taskinfo?taskID=48259627
> >   log: https://kojipkgs.fedoraproject.org//work/tasks/9762/48259762/build.log
> >
> > The error is “nozero6.img was trimmed by mistake”.  I added “set -x”
> > to the script earlier today so we can see exactly what's wrong, and it
> > is that:
> >
> >   ++ stat -c %b nozero2.img
> >   ++ stat -c %b nozero6.img
> >   + test 4096 '!=' 2048
> >   + echo 'nozero6.img was trimmed by mistake'
>
> Hmm, maybe it is file-system dependent (not all filesystems reserve the
> same amount of space for a sparse file - that's something that qemu
> iotests keep on hitting).

In qemu iotests we check how much space an empty file is using to avoid
issues with filesystems allocating extra blocks.

https://github.com/qemu/qemu/blob/d74824cf7c8b352f9045e949dc636c7207a41eee/tests/qemu-iotests/175#L82

Not sure this is relevant to this case since this grows from 2048 blocks
to 4096, not by 1 extra block.

In oVirt we avoid these issues by testing on filesystems we control
instead of on whatever filesystem the CI environment provides.

We use this:
https://github.com/nirs/userstorage

The idea is that you configure some combinations you want to test, like:
https://github.com/oVirt/ovirt-imageio/blob/master/storage.py

Prepare this storage as root before running the tests:
https://github.com/oVirt/ovirt-imageio/blob/7676d97e49eb1399ed5256e08786c006ce5ff9ee/Makefile#L17

Then the tests pick some of the available storage as needed:
https://github.com/oVirt/ovirt-imageio/blob/7676d97e49eb1399ed5256e08786c006ce5ff9ee/daemon/test/backends_file_test.py#L40

Tests using the user_file fixture above are run once for every
parameter, so we know the code works on 3 different filesystems with
2 logical block sizes.

Nir
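The baseline idea Nir describes (measure what an empty file costs on
the filesystem under test, then discount it) might be sketched in shell
roughly as follows.  This is an illustration only, not a copy of the
qemu iotests code: the helper names are made up, GNU stat and mktemp
are assumed, and as Nir notes it would only absorb a small fixed
overhead, not a doubling from 2048 to 4096 blocks.

```shell
# Hypothetical helpers illustrating the empty-file baseline idea.

# empty_file_blocks DIR
# Print how many blocks (stat %b) a freshly created empty file uses in DIR.
empty_file_blocks() {
    local tmp blocks status
    tmp=$(mktemp "$1/baseline.XXXXXX") || return 1
    blocks=$(stat -c %b "$tmp")
    status=$?
    rm -f "$tmp"
    [ "$status" -eq 0 ] && echo "$blocks"
}

# data_blocks FILE
# Print FILE's allocated blocks minus the empty-file baseline measured
# in the same directory, so fixed per-file overhead is not counted.
data_blocks() {
    local base used
    base=$(empty_file_blocks "$(dirname "$1")") || return 1
    used=$(stat -c %b "$1") || return 1
    echo $(( used - base ))
}
```

Comparing "data_blocks nozero2.img" with "data_blocks nozero6.img"
would then ignore any constant number of blocks the filesystem adds to
every file.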
One thing I noticed which is a bit odd is:

  $ rm file; for f in {0..1023}; do printf '%1024s' .; done > file; stat -c "%b %B" file
  2048 512
  $ rm file; for f in {0..1023}; do printf '%1024s' . >> file; done ; stat -c "%b %B" file
  3968 512

The second method is how we currently create the file.  Looking
through the history there seems to be no reason for that, so I'm going
to push a commit which changes file creation to the first method, and
it may be slightly faster too.

However it makes me wonder if the file is not laid out in a single
extent and if that might be causing our problems.  Being only able to
reproduce this on Koji makes it a bit tedious to test theories :-(

Rich.
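One way to probe the single-extent theory would be to count extents
with filefrag (from e2fsprogs) after each of the two creation methods.
The sketch below assumes filefrag is installed, that the filesystem
supports the extent query it uses, and that its default output ends
with a summary of the form "N extent(s) found"; the helper names are
hypothetical.

```shell
# Hypothetical helpers; assume filefrag from e2fsprogs is available.

# parse_extent_count
# Read filefrag's summary line(s) on stdin and print only the extent
# count, e.g. "file: 3 extents found" -> 3.
parse_extent_count() {
    sed -n 's/.*: \([0-9][0-9]*\) extents\{0,1\} found.*/\1/p'
}

# extent_count FILE
# Print the number of extents filefrag reports for FILE.
extent_count() {
    filefrag "$1" | parse_extent_count
}
```

Running "extent_count file" right after each of the two loops above
would show whether the append-per-iteration method fragments the file
more than the single-redirect method on a given filesystem.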