We run our IMAP spool on ZFS that's derived from LUNs on a Netapp
filer. There's a great deal of churn in e-mail folders, with messages
appearing and being deleted frequently. I know that ZFS uses
copy-on-write, so that blocks in use are never overwritten, and that
deleted blocks are added to a free list. This behavior would spread
the free list all over the zpool. As well, the Netapp uses WAFL, also
a variety of copy-on-write. The LUNs appear as large files on the
filer. It won't know which blocks are in use by ZFS. It would have
to do copy-on-write each time, I suppose. Do we have a problem here?

The Netapp has a utility that will defragment files on a volume. It
must put them back into sequential order. Does ZFS have any concept
of the geometry of its disks? If so, regular defragmentation on the
Netapp might be a good thing.

Should ZFS and the Netapp be using the same blocksize, so that they
cooperate to some extent?

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
On Sun, Apr 26, 2009 at 3:52 PM, Gary Mills <mills at cc.umanitoba.ca> wrote:

> We run our IMAP spool on ZFS that's derived from LUNs on a Netapp
> filer. There's a great deal of churn in e-mail folders, with messages
> appearing and being deleted frequently. I know that ZFS uses
> copy-on-write, so that blocks in use are never overwritten, and that
> deleted blocks are added to a free list. This behavior would spread
> the free list all over the zpool. As well, the Netapp uses WAFL, also
> a variety of copy-on-write. The LUNs appear as large files on the
> filer. It won't know which blocks are in use by ZFS. It would have
> to do copy-on-write each time, I suppose. Do we have a problem here?

Not at all.

> The Netapp has a utility that will defragment files on a volume. It
> must put them back into sequential order. Does ZFS have any concept
> of the geometry of its disks? If so, regular defragmentation on the
> Netapp might be a good thing.

I assume you mean reallocate on the filer? This is run automatically
as part of weekly maintenance. There are flags to run it more
aggressively, but unless you're actually seeing problems, I would
suggest avoiding doing so.

> Should ZFS and the Netapp be using the same blocksize, so that they
> cooperate to some extent?

Just make sure ZFS is using a block size that is a multiple of 4k,
which I believe it does by default.

I have to ask though... why not just serve NFS off the filer to the
Solaris box? ZFS on a LUN served off a filer seems to make about as
much sense as sticking a ZFS-based LUN behind a v-filer (although the
latter might actually make sense in a world where it were supported
*cough*neverhappen*cough*, since you could buy the "cheap" newegg
disk).

--Tim
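For reference, the recordsize Tim refers to can be checked and set
per dataset; a minimal sketch (the dataset name "imappool/spool" is
hypothetical):

    # Show the current maximum block size for the dataset;
    # the default is 128K, which is a multiple of 4k:
    zfs get recordsize imappool/spool

    # Set it explicitly if desired; this only affects files
    # written after the change, not existing blocks:
    zfs set recordsize=128K imappool/spool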
>>>>> "t" == Tim <tim at tcsac.net> writes:t> why not just serve NFS off the filer there can be some benefit to the lossless FC fabric through eliminating TCP RTO''s and applying backpressure so the initiator has more control over I/O scheduling. As discussed here, block-based storage can produce fewer synchronous calls / rtt waits than NFS for workloads involving opening and closing lots of small files when you are not calling fsync on them. I state both based on theory not experience, and I''m not saying that''s Gary''s workload falls in the second category, nor that NFS is necessarily the wrong approach, but here are two reasons a sane person might plausibly decide to use the LUN interface instead. I''m sure there are more arguments for and against. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090426/b49bacf3/attachment.bin>
On Sun, Apr 26, 2009 at 05:19:18PM -0400, Ellis, Mike wrote:

> As soon as you put those zfs blocks on top of iscsi, the netapp won't
> have a clue as far as how to defrag those "iscsi files" from the
> filer's perspective. (It might do some fancy stuff based on
> read/write patterns, but that's unlikely)

Since the LUN is just a large file on the Netapp, I assume that all
it can do is to put the blocks back into sequential order. That might
have some benefit overall.

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
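For what it's worth, a sketch of the filer-side reallocate commands
Tim mentioned above, assuming ONTAP 7-mode syntax of that era; the
volume path "/vol/imap_luns" is hypothetical, and the exact flags
should be confirmed against the reallocate man page on your release:

    # Report how well laid out the volume holding the LUN file
    # currently is:
    reallocate measure /vol/imap_luns

    # Start a one-time reallocation scan of the volume; -f forces
    # it to run even if the measured threshold hasn't been crossed:
    reallocate start -f /vol/imap_luns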
On Sun, Apr 26, 2009 at 05:02:38PM -0500, Tim wrote:

> On Sun, Apr 26, 2009 at 3:52 PM, Gary Mills <mills at cc.umanitoba.ca> wrote:
>
> > We run our IMAP spool on ZFS that's derived from LUNs on a Netapp
> > filer. There's a great deal of churn in e-mail folders, with
> > messages appearing and being deleted frequently.
>
> > Should ZFS and the Netapp be using the same blocksize, so that they
> > cooperate to some extent?
>
> Just make sure ZFS is using a block size that is a multiple of 4k,
> which I believe it does by default.

Okay, that's good.

> I have to ask though... why not just serve NFS off the filer to the
> Solaris box? ZFS on a LUN served off a filer seems to make about as
> much sense as sticking a ZFS-based LUN behind a v-filer (although the
> latter might actually make sense in a world where it were supported
> *cough*neverhappen*cough*, since you could buy the "cheap" newegg
> disk).

I prefer NFS too, but the IMAP server requires POSIX semantics.
I believe that NFS doesn't support that, at least NFS version 3.

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
Gary Mills wrote:

> We run our IMAP spool on ZFS that's derived from LUNs on a Netapp
> filer. There's a great deal of churn in e-mail folders, with messages
> appearing and being deleted frequently. I know that ZFS uses
> copy-on-write, so that blocks in use are never overwritten, and that
> deleted blocks are added to a free list. This behavior would spread
> the free list all over the zpool. As well, the Netapp uses WAFL, also
> a variety of copy-on-write. The LUNs appear as large files on the
> filer. It won't know which blocks are in use by ZFS. It would have
> to do copy-on-write each time, I suppose. Do we have a problem here?
>
> The Netapp has a utility that will defragment files on a volume. It
> must put them back into sequential order. Does ZFS have any concept
> of the geometry of its disks? If so, regular defragmentation on the
> Netapp might be a good thing.

If you measure this, then please share your results. There is much
speculation, but little characterization, of the "ills of COW
performance."

> Should ZFS and the Netapp be using the same blocksize, so that they
> cooperate to some extent?

ZFS blocksize is dynamic, power of 2, with a max size == recordsize.
Writes can also be coalesced. If you want to measure the distribution,
there are a few DTrace scripts which will measure it (eg. iosnoop).

I did a large e-mail-server-over-ZFS POC earlier this year. We could
handle more than 250,000 users on a T5120 message store server using
decent storage (lots of spindles). Since IMAP is quite a unique and
demanding I/O workload, we were very pleased with how well ZFS worked.
But low-latency storage is key to maintaining such large workloads.
 -- richard
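As one illustration of measuring that distribution, a minimal DTrace
one-liner using the stable io provider (run as root; this is a generic
sketch, not one of the scripts Richard refers to):

    # Histogram of physical I/O sizes per device; prints a power-of-2
    # distribution for each device when you hit Ctrl-C:
    dtrace -n 'io:::start { @sizes[args[1]->dev_statname] = quantize(args[0]->b_bcount); }'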
On 26 April, 2009 - Gary Mills sent me these 1,3K bytes:

> On Sun, Apr 26, 2009 at 05:02:38PM -0500, Tim wrote:
> > I have to ask though... why not just serve NFS off the filer to the
> > Solaris box? ZFS on a LUN served off a filer seems to make about as
> > much sense as sticking a ZFS-based LUN behind a v-filer (although the
> > latter might actually make sense in a world where it were supported
> > *cough*neverhappen*cough*, since you could buy the "cheap" newegg
> > disk).
>
> I prefer NFS too, but the IMAP server requires POSIX semantics.
> I believe that NFS doesn't support that, at least NFS version 3.

What non-POSIXness are you referring to, or is it just random old
thoughts that actually don't apply?

Lots of people (me for instance) are using IMAP servers with data
served over NFSv3..

/Tomas
-- 
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
> ZFS blocksize is dynamic, power of 2, with a max size == recordsize.

Minor clarification: recordsize is restricted to powers of 2, but
blocksize is not -- it can be any multiple of the sector size (512
bytes). For small files, this matters: a 37k file is stored in a 37k
block. For larger, multi-block files, the size of each block is indeed
a power of 2 (simplifies the math a bit).

Jeff
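One way to see this in practice is to inspect a small file's object
with zdb; a sketch, assuming a hypothetical dataset "tank/fs" (the
<object-number> placeholder comes from the ls output, and the exact
zdb output format varies by release):

    # Write a 37k file and force it out to disk:
    dd if=/dev/urandom of=/tank/fs/smallfile bs=1k count=37
    sync

    # In ZFS the inode number doubles as the object number:
    ls -i /tank/fs/smallfile

    # Dump that object's metadata; the dblk field should show a
    # single 37K data block rather than a power of 2:
    zdb -dddd tank/fs <object-number>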
On Mon, April 27, 2009 02:13, Tomas Ögren wrote:

> On 26 April, 2009 - Gary Mills sent me these 1,3K bytes:
>
>> I prefer NFS too, but the IMAP server requires POSIX semantics.
>> I believe that NFS doesn't support that, at least NFS version 3.
>
> What non-POSIXness are you referring to, or is it just random old
> thoughts that actually don't apply?
>
> Lots of people (me for instance) are using IMAP servers with data
> served over NFSv3..

Depends on the IMAP server. Cyrus for example doesn't
recommend/support it:

> Using NFS: We don't recommend it. If you want to do it, it may
> possibly work but you may also lose your email or have corrupted
> cyrus.* files. You can look at the mailing list archives for more
> information.

http://cyrusimap.web.cmu.edu/imapd/faq.html

As for non-POSIXness:

> In fact, because XNFS provides transparent access to remote files, it
> is not possible for a process to distinguish between local and remote
> files before they are used. Due to the nature of the way XNFS works,
> there are some semantic differences between operations on local files
> and equivalent operations on remote files.
>
> This appendix gives a summary of these semantic differences. Together
> with "Open-System Interface Semantics over XNFS" and "Open System
> Utilities Semantics over XNFS" this appendix specifies differences
> that can occur when using a given utility or function with a file on
> a remote file system.

http://www.opengroup.org/onlinepubs/9629799/apdxa.htm

It's copyright 1998, and only refers to NFSv2 and v3, so it may be out
of date (especially with NFSv4[.1]).
Hello Jeff,

Monday, April 27, 2009, 9:12:26 AM, you wrote:

>> ZFS blocksize is dynamic, power of 2, with a max size == recordsize.

JB> Minor clarification: recordsize is restricted to powers of 2, but
JB> blocksize is not -- it can be any multiple of the sector size (512
JB> bytes). For small files, this matters: a 37k file is stored in a
JB> 37k block. For larger, multi-block files, the size of each block
JB> is indeed a power of 2 (simplifies the math a bit).

which is a consequence of recordsize being a power of 2; multi-block
files will usually (always?) have a block size equivalent to the
recordsize value.

Has the issue with the tail block been fixed yet?

-- 
Best regards,
Robert Milkowski
http://milek.blogspot.com