t. johnson
2009-Jul-22 19:10 UTC
[zfs-discuss] virtualization, alignment and zfs variation stripes
One of the things that commonly comes up in the server virtualization world is making sure that all of the storage elements are "aligned". This is because there are often so many levels of abstraction, each using its own "block size", that without any tuning their blocks usually straddle one another, which can cause 2 or even 3 times the I/O in some cases to read what would be just one block. I guess this was also a common thing in the SAN world many years back.

Let's say I have a simple-ish setup that uses VMware files for virtual disks on an NFS share from zfs. I'm wondering how zfs' variable block size comes into play? Does it make the alignment problem go away? Does it make it worse? Or should we perhaps be creating filesystems with a fixed block size for virtualization workloads?

--
This message posted from opensolaris.org
Bob Friesenhahn
2009-Jul-22 19:45 UTC
[zfs-discuss] virtualization, alignment and zfs variation stripes
On Wed, 22 Jul 2009, t. johnson wrote:

> Let's say I have a simple-ish setup that uses vmware files for
> virtual disks on an NFS share from zfs. I'm wondering how zfs'
> variable block size comes into play? Does it make the alignment
> problem go away? Does it make it worse? Or should we perhaps be

My understanding is that zfs uses fixed block sizes except for the tail block of a file, or if the filesystem has compression enabled. Zfs's large blocks can definitely cause performance problems if the system has insufficient memory to cache the blocks which are accessed, or if only part of a block is updated.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
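To put rough numbers on the "only part of the block is updated" case, here is a minimal sketch. It is a simplified model, not ZFS code: it assumes fixed-size records, nothing cached, and that any record only partly covered by a write must be read in full before the whole record is written back. The function name and the returned fields are made up for illustration.

```python
def partial_update_cost(record_size: int, write_offset: int, write_len: int) -> dict:
    """Estimate bytes moved for one write under a simple read-modify-write model.

    Assumptions (illustrative only): fixed-size records, cold cache, and every
    touched record is rewritten in full.
    """
    if record_size <= 0 or write_len <= 0 or write_offset < 0:
        raise ValueError("sizes must be positive and the offset non-negative")

    write_end = write_offset + write_len          # exclusive end of the write
    first = write_offset // record_size           # first record touched
    last = (write_end - 1) // record_size         # last record touched

    bytes_read = 0
    for rec in range(first, last + 1):
        rec_start = rec * record_size
        rec_end = rec_start + record_size
        fully_overwritten = write_offset <= rec_start and write_end >= rec_end
        if not fully_overwritten:
            # The untouched part of the record has to come from somewhere.
            bytes_read += record_size

    records = last - first + 1
    return {
        "records": records,
        "bytes_read": bytes_read,
        "bytes_written": records * record_size,
    }


if __name__ == "__main__":
    # A 4 KiB guest write into a 128 KiB record versus a matching 4 KiB record.
    print(partial_update_cost(128 * 1024, 0, 4096))
    print(partial_update_cost(4096, 0, 4096))
```

Under this model a 4 KiB update inside a 128 KiB record moves 128 KiB in each direction, while the same update against a 4 KiB record needs no read at all. Real behaviour also depends on caching and on how writes are batched.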
thomas
2009-Jul-23 05:29 UTC
[zfs-discuss] virtualization, alignment and zfs variation stripes
Hmm.. I guess that's what I've heard as well.

I do run compression and believe a lot of others would as well. So then, it seems to me that if I have guests that run a filesystem formatted with 4k blocks, for example, I'm inevitably going to have this overlap when using ZFS network storage?

So if "A" were zfs blocks and "B" were virtualized guest blocks, I think it might look like this with compression on?

|   B1   |   B2   |   B3   |   B4   |
|   A1   | A2 |     A3      |  A4  |

So if the guest OS wants blocks B2 or B4, it actually has to read 2 blocks from the underlying zfs storage?
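Taking the diagram above at face value, the question "which A blocks does each B block touch?" is an interval-overlap lookup. The sketch below is only an illustration of that reasoning: the boundary numbers are made-up column positions chosen to resemble the picture, not real ZFS offsets, and `extents_touched` is a hypothetical helper.

```python
from bisect import bisect_right


def extents_touched(boundaries: list[int], start: int, length: int) -> list[int]:
    """Return indexes of the extents overlapped by the range [start, start + length).

    `boundaries` holds the sorted start offset of every extent followed by the
    end offset of the last one, so extent i spans boundaries[i]..boundaries[i + 1].
    """
    if length <= 0 or start < boundaries[0] or start + length > boundaries[-1]:
        raise ValueError("range must be non-empty and inside the extents")
    first = bisect_right(boundaries, start) - 1
    last = bisect_right(boundaries, start + length - 1) - 1
    return list(range(first, last + 1))


# Hypothetical layout, in arbitrary units, loosely following the diagram:
# four equal guest blocks (B) over four unequal backing blocks (A).
B_BOUNDS = [0, 9, 18, 27, 36]
A_BOUNDS = [0, 9, 14, 28, 36]

if __name__ == "__main__":
    for i in range(len(B_BOUNDS) - 1):
        start = B_BOUNDS[i]
        length = B_BOUNDS[i + 1] - start
        touched = [f"A{j + 1}" for j in extents_touched(A_BOUNDS, start, length)]
        print(f"B{i + 1} -> {touched}")
```

With these assumed boundaries, B1 and B3 each fall inside a single A block, while B2 and B4 each straddle two, which is the situation the question describes.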
Nicolas Williams
2009-Jul-23 05:49 UTC
[zfs-discuss] virtualization, alignment and zfs variation stripes
On Wed, Jul 22, 2009 at 02:45:52PM -0500, Bob Friesenhahn wrote:

> On Wed, 22 Jul 2009, t. johnson wrote:
> > Let's say I have a simple-ish setup that uses vmware files for
> > virtual disks on an NFS share from zfs. I'm wondering how zfs'
> > variable block size comes into play? Does it make the alignment
> > problem go away? Does it make it worse? Or should we perhaps be
>
> My understanding is that zfs uses fixed block sizes except for the
> tail block of a file, or if the filesystem has compression enabled.

For one-block files, the block size is variable, between 512 bytes and the smaller of the dataset's recordsize or 128KB. For multi-block files, all blocks are the same size, except the tail block. But these are sizes in file data, not actual on-disk sizes (which can be smaller because of compression).

> Zfs's large blocks can definitely cause performance problems if the
> system has insufficient memory to cache the blocks which are accessed,
> or only part of the block is updated.

You should set the virtual disk image files' recordsize (or, rather, the containing dataset's recordsize) to match the preferred block size of the filesystem types (or data) that you'll put on those virtual disks.

Nico
--
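Since record boundaries in file data are fixed, whether a guest block maps to one record or two comes down to arithmetic on offsets. The following is a small sketch under stated assumptions: `records_touched` is a hypothetical helper, and the 63-sector partition start is just the well-known legacy MBR layout used as an example of a misaligned guest, not something taken from this thread.

```python
def records_touched(guest_offset: int, guest_len: int, recordsize: int,
                    partition_start: int = 0) -> int:
    """Count fixed-size records covered by one guest I/O.

    guest_offset    byte offset of the I/O inside the guest filesystem
    partition_start byte offset of that filesystem inside the disk image file
    """
    if guest_len <= 0 or recordsize <= 0:
        raise ValueError("length and recordsize must be positive")
    start = partition_start + guest_offset
    end = start + guest_len - 1                   # last byte of the I/O
    return end // recordsize - start // recordsize + 1


if __name__ == "__main__":
    LEGACY_START = 63 * 512   # assumed example: partition begins at sector 63

    # 4 KiB guest block, 4 KiB recordsize, partition aligned to the record size.
    print(records_touched(8192, 4096, 4096))
    # Same I/O when the partition starts at sector 63: the block straddles records.
    print(records_touched(8192, 4096, 4096, partition_start=LEGACY_START))
```

In this model, matching recordsize to the guest block size gives a one-to-one mapping only when the guest partition itself starts on a record boundary; with the assumed 63-sector start, every 4 KiB guest block covers two 4 KiB records.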
Fajar A. Nugraha
2009-Jul-23 05:49 UTC
[zfs-discuss] virtualization, alignment and zfs variation stripes
On Thu, Jul 23, 2009 at 12:29 PM, thomas <no-reply at opensolaris.org> wrote:

> Hmm.. I guess that's what I've heard as well.
>
> I do run compression and believe a lot of others would as well. So then, it seems
> to me that if I have guests that run a filesystem formatted with 4k blocks for
> example.. I'm inevitably going to have this overlap when using ZFS network
> storage?
>
> So if "A" were zfs blocks and "B" were virtualized guest blocks, I think it might
> look like this with compression on?
>
> |   B1   |   B2   |   B3   |   B4   |
> |   A1   | A2 |     A3      |  A4  |
>
> So if the guest OS wants blocks B2 or B4, it actually has to read 2 blocks from the
> underlying zfs storage?

AFAIK, if you use a zvol and set the zfs volblocksize to be the same as the fs block size on the virtualized system (which is 4k by default for a disk/partition of several GB with ext3/ntfs), every virtualized block read should correspond to one zfs block read.

If you set compression on, the actual bytes read from storage will not always be 4k, though; they can be fewer, depending on how compressible the data is.

--
Fajar
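The distinction between the logical block (always 4k here) and the bytes actually stored can be illustrated with a toy model. This is a sketch only: Python's zlib stands in for whatever compressor the pool uses, and both the 512-byte sector rounding and the "store the block raw if compression does not shrink it" rule are assumptions of the model rather than a description of ZFS internals.

```python
import random
import zlib

SECTOR = 512  # assumed allocation unit for this model


def stored_size(block: bytes, sector: int = SECTOR) -> int:
    """Bytes this model would store for one logical block.

    The compressed length is rounded up to a whole number of sectors; if that
    is not smaller than the block itself, the block is kept uncompressed.
    """
    compressed = len(zlib.compress(block))
    rounded = -(-compressed // sector) * sector   # ceiling to a sector multiple
    return min(rounded, len(block))


if __name__ == "__main__":
    rng = random.Random(0)
    zeros = bytes(4096)                                       # highly compressible
    noise = bytes(rng.getrandbits(8) for _ in range(4096))    # effectively incompressible

    for name, block in (("zeros", zeros), ("noise", noise)):
        print(f"{name}: logical {len(block)} bytes, stored {stored_size(block)} bytes")
```

Both blocks occupy the same 4k slot as far as the guest is concerned, so one guest block still maps to one backing block; only the amount of data that has to come off the disk changes.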