Richard W.M. Jones
2022-Jun-15 10:09 UTC
[Libguestfs] Kernel driver I/O block size hinting
On Tue, Jun 14, 2022 at 08:30:15PM +0100, Nikolaus Rath wrote:
> On Jun 14 2022, "Richard W.M. Jones" <rjones at redhat.com> wrote:
> > I think we should set logical_block_size == physical_block_size =
> > MAX (512, NBD minimum block size constraint).
>
> Why the lower bound of 512?

I suspect the kernel can't handle sector sizes smaller than 512 bytes.
By default the NBD protocol advises advertising a minimum size of
1 byte, and I'm almost certain setting logical_block_size == 1 would
break everything.

> > What should happen to the nbd-client -b option?
>
> Perhaps it should become the lower-bound (instead of the hardcoded 512)?
> That's assuming there is a reason for having a client-specified lower
> bound.

Right, I don't think there's a reason to continue with the -b option.
I only use it to set -b 512 to work around the annoying default in
older versions (which was 1024).

> > (4) Kernel blk_queue_max_hw_sectors: This is documented as: "set max
> > sectors for a request ... Enables a low level driver to set a hard
> > upper limit, max_hw_sectors, on the size of requests."
> >
> > Current behaviour of nbd.ko is that we set this to 65536 (sectors?
> > blocks?), which for 512b sectors is 32M.
>
> FWIW, on my 5.16 kernel, the default is 65 kB (according to
> /sys/block/nbdX/queue/max_sectors_kb x 512b).

I have:

$ cat /sys/devices/virtual/block/nbd0/queue/max_hw_sectors_kb
32768

(ie. 32 MB) which I think comes from the nbd module setting:

  blk_queue_max_hw_sectors(disk->queue, 65536);

multiplied by 512b sectors.

> > I think we could set this to MIN (32M, NBD maximum block size
> > constraint), converting the result to sectors.
>
> I don't think that's right. Rather, it should be NBD's preferred block
> size.
>
> Setting this to the preferred block size means that NBD requests will be
> this large whenever there are enough sequential dirty pages, and that no
> requests will ever be larger than this. I think this is exactly what the
> NBD server would like to have.

This kernel setting limits the maximum request size on the queue.

In my testing reading and writing files with the default [above] the
kernel never got anywhere near sending multi-megabyte requests.  In
fact the largest request it sent was 128K, even when I did stuff like:

# dd if=/dev/zero of=/tmp/mnt/zero bs=100M count=10

128K happens to be 2 x blk_queue_io_opt, but I need to do more testing
to see if that relationship always holds.

> Setting this to the maximum block size would mean that NBD requests
> will exceed the preferred size whenever there are enough sequential
> dirty pages (while still obeying the maximum). This seems strictly
> worse.
>
> Unrelated to the proposed changes (all of which I think are technically
> correct), I am wondering if this will have much practical benefits. As
> far as I can tell, the kernel currently aligns NBD requests to the
> logical/physical block size rather than the size of the NBD request. Are
> there NBD servers that would benefit from the kernel honoring the
> preferred blocksize if the data is not also aligned to this blocksize?

I'm not sure I parsed this.  Can you give an example?

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins https://gitlab.com/nbdkit/nbdkit
Richard W.M. Jones
2022-Jun-15 10:35 UTC
[Libguestfs] Kernel driver I/O block size hinting
On Wed, Jun 15, 2022 at 11:09:31AM +0100, Richard W.M. Jones wrote:
> This kernel setting limits the maximum request size on the queue.
>
> In my testing reading and writing files with the default [above] the
> kernel never got anywhere near sending multi-megabyte requests.  In
> fact the largest request it sent was 128K, even when I did stuff like:
>
> # dd if=/dev/zero of=/tmp/mnt/zero bs=100M count=10
>
> 128K happens to be 2 x blk_queue_io_opt, but I need to do more testing
> to see if that relationship always holds.

The answer is apparently no.  With minimum_io_size == 64K and
optimal_io_size == 256K, the server still only sees at most 128K
requests.

Although I still think we need to make these changes to nbd.ko, I
don't think this is going to solve the original problem of trying to
aggregate requests into the very large block sizes favoured by S3.
(The nbdkit blocksize filter plus a layer of caching seems like the
way to go for that.)

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch http://libguestfs.org/virt-builder.1.html
On Jun 15 2022, "Richard W.M. Jones" <rjones at redhat.com> wrote:
>> > I think we could set this to MIN (32M, NBD maximum block size
>> > constraint), converting the result to sectors.
>>
>> I don't think that's right. Rather, it should be NBD's preferred block
>> size.
>>
>> Setting this to the preferred block size means that NBD requests will be
>> this large whenever there are enough sequential dirty pages, and that no
>> requests will ever be larger than this. I think this is exactly what the
>> NBD server would like to have.
>
> This kernel setting limits the maximum request size on the queue.

Right. But why not limit it to the *preferred* blocksize of the NBD
server? The kernel obviously does not care, and the NBD server
obviously prefers this blocksize over the maximum block size.

> In my testing reading and writing files with the default [above] the
> kernel never got anywhere near sending multi-megabyte requests.

Well, yes, but that shouldn't affect which value we should use, I think.

>> Unrelated to the proposed changes (all of which I think are technically
>> correct), I am wondering if this will have much practical benefits. As
>> far as I can tell, the kernel currently aligns NBD requests to the
>> logical/physical block size rather than the size of the NBD request. Are
>> there NBD servers that would benefit from the kernel honoring the
>> preferred blocksize if the data is not also aligned to this blocksize?
>
> I'm not sure I parsed this. Can you give an example?

No - I am asking for examples :-). My question is: in which scenario is
it helpful for the NBD server to receive non-aligned requests of its
preferred blocksize? Isn't that just as bad as receiving requests with
a non-preferred blocksize?

Best,
-Nikolaus

--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

"Time flies like an arrow, fruit flies like a Banana."