On Mon, Feb 27, 2023 at 04:24:33PM +0200, Nir Soffer wrote:
> On Mon, Feb 27, 2023 at 3:56 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> >
> > https://github.com/kubevirt/containerized-data-importer/issues/1520
> >
> > Hi Eric,
> >
> > We had a question from the Kubevirt team related to the above issue.
> > The question is roughly if it's possible to calculate the checksum of
> > an image as an nbdkit filter and/or in the qemu block layer.
> >
> > Supplemental #1: could qemu-img convert calculate a checksum as it goes
> > along?
> >
> > Supplemental #2: could we detect various sorts of common errors, such
> > as a webserver that is incorrectly configured and serves up an error page
> > containing "<html>"; or something which is supposed to be a disk image
> > but does not "look like" (in some ill-defined sense) a disk image,
> > eg. it has no partition table.
> >
> > I'm not sure if qemu has any existing features covering the above (and
> > I know for sure that nbdkit doesn't).
> >
> > One issue is that calculating a checksum involves a linear scan of the
> > image, although we can at least skip holes.
>
> Kubevirt can use blksum
> https://fosdem.org/2023/schedule/event/vai_blkhash_fast_disk/
>
> But we need to package it for Fedora/CentOS Stream.
>
> I also work on "qemu-img checksum", getting more reviews on this can help:
> Latest version:
> https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00971.html
> Last reviews are here:
> https://lists.nongnu.org/archive/html/qemu-block/2022-12/
>
> More work is needed on the testing framework changes.

I think it would be more useful if (or in addition) it could compute
the checksum of a stream which is being converted with 'qemu-img
convert'.  Extra points if it can compute the checksum over either the
input or output stream.

Rich.
-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
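
The suggestion above, computing a checksum as a side effect of copying a stream, can be illustrated with a minimal Python sketch. This is not qemu-img or nbdkit code: `copy_with_checksum` and its parameters are hypothetical names, and the sketch hashes the raw bytes in the order they are read, so it assumes a strictly sequential copy of a raw stream.

```python
import hashlib
import io


def copy_with_checksum(src, dst, algorithm="sha256", chunk_size=1024 * 1024):
    """Copy src to dst sequentially and return the hex digest of the bytes copied.

    Hypothetical helper: src and dst are ordinary file-like objects.
    """
    h = hashlib.new(algorithm)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # Hash and write the same chunk, so the digest covers exactly
        # the bytes that were transferred.
        h.update(chunk)
        dst.write(chunk)
    return h.hexdigest()


# Usage with in-memory streams standing in for the source and destination.
src = io.BytesIO(b"\0" * 4096 + b"data" * 1024)
dst = io.BytesIO()
digest = copy_with_checksum(src, dst)
```

Because input and output see the same bytes here, one digest covers both. For a conversion between image formats the two streams differ, so the checksum would have to be defined over one side or over the guest-visible data.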
On Mon, Feb 27, 2023 at 6:41 PM Richard W.M. Jones <rjones at redhat.com> wrote:
>
> On Mon, Feb 27, 2023 at 04:24:33PM +0200, Nir Soffer wrote:
> > On Mon, Feb 27, 2023 at 3:56 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> > >
> > > https://github.com/kubevirt/containerized-data-importer/issues/1520
> > >
> > > Hi Eric,
> > >
> > > We had a question from the Kubevirt team related to the above issue.
> > > The question is roughly if it's possible to calculate the checksum of
> > > an image as an nbdkit filter and/or in the qemu block layer.
> > >
> > > Supplemental #1: could qemu-img convert calculate a checksum as it goes
> > > along?
> > >
> > > Supplemental #2: could we detect various sorts of common errors, such
> > > as a webserver that is incorrectly configured and serves up an error page
> > > containing "<html>"; or something which is supposed to be a disk image
> > > but does not "look like" (in some ill-defined sense) a disk image,
> > > eg. it has no partition table.
> > >
> > > I'm not sure if qemu has any existing features covering the above (and
> > > I know for sure that nbdkit doesn't).
> > >
> > > One issue is that calculating a checksum involves a linear scan of the
> > > image, although we can at least skip holes.
> >
> > Kubevirt can use blksum
> > https://fosdem.org/2023/schedule/event/vai_blkhash_fast_disk/
> >
> > But we need to package it for Fedora/CentOS Stream.
> >
> > I also work on "qemu-img checksum", getting more reviews on this can help:
> > Latest version:
> > https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00971.html
> > Last reviews are here:
> > https://lists.nongnu.org/archive/html/qemu-block/2022-12/
> >
> > More work is needed on the testing framework changes.
>
> I think it would be more useful if (or in addition) it could compute
> the checksum of a stream which is being converted with 'qemu-img
> convert'.  Extra points if it can compute the checksum over either the
> input or output stream.

I thought about this. It could be a filter that you add to the graph,
which gives you the checksum as a side effect of copying. But this
requires disabling unordered writes, which is pretty bad for performance.

But even if you compute the checksum during a transfer, you still want to
verify it by reading the transferred data back from storage. Once you have
computed the checksum, you can keep it for verifying the same image in
the future.

Nir
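
To illustrate why a checksum computed inside the copy path depends on write ordering, here is a small Python sketch. It is an assumption-laden illustration, not the qemu block layer or the blkhash format: `BLOCK_SIZE`, `streaming_digest` and `block_digests_root` are hypothetical names. A plain streaming hash changes when writes complete out of order, while hashing each block separately and combining the block digests in offset order does not.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # hypothetical block size, chosen for the example


def streaming_digest(chunks, algorithm="sha256"):
    """Hash chunks in the order they arrive (order sensitive)."""
    h = hashlib.new(algorithm)
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def block_digests_root(blocks_by_offset, algorithm="sha256"):
    """Hash each block on its own, then hash the block digests in offset order."""
    outer = hashlib.new(algorithm)
    for offset in sorted(blocks_by_offset):
        outer.update(hashlib.new(algorithm, blocks_by_offset[offset]).digest())
    return outer.hexdigest()


# A 256 KiB "image" made of four distinct blocks.
image = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
writes = [(off, image[off:off + BLOCK_SIZE])
          for off in range(0, len(image), BLOCK_SIZE)]

# Simulate unordered writes: the same blocks, completing in another order.
completion_order = [writes[2], writes[0], writes[3], writes[1]]

in_order = streaming_digest(data for _, data in writes)
out_of_order = streaming_digest(data for _, data in completion_order)

# "Storage" after the transfer: offset -> data, regardless of completion order.
stored = dict(completion_order)
root_ordered = block_digests_root(dict(writes))
root_unordered = block_digests_root(stored)

# Verification by reading the transferred data back in offset order.
readback = b"".join(stored[off] for off in sorted(stored))
readback_digest = streaming_digest([readback])
```

Here `in_order` and `out_of_order` differ, `root_ordered` and `root_unordered` match, and `readback_digest` equals `in_order`. The read-back step is what actually checks the data that reached storage; the digest can then be kept for verifying the same image later.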