Hello all,

I recently heard an argument from a colleague that "ZFS mis-uses the term COW" (Copy-On-Write). According to him, the original term was introduced by some vendors and was to be taken literally: whenever a new write comes in to update an existing logical block in the storage, the block's old contents are first copied away to another physical location (e.g. to be used for snapshotting or for recovery from an untimely poweroff/panic), and then the original on-disk location is rewritten with the new data.

Arguably, while this incurs a hit when rewriting existing data, it combats fragmentation and speeds up reads (i.e. all pieces of the file's "live" version are stored as contiguously as possible). This may be important for large objects randomly updated "inside", like VM disk images and iSCSI backing stores, precreated database table files, maybe swapfiles, etc.

I understand why ZFS does what it does, and how, but such subtle differences in terminology may cause misunderstanding between people of the same trade. At the least, I'd keep this possibility in mind when talking to non-Solaris storage admins ;)

I wonder if this use of the term is indeed the more valid one (making a copy of old data upon a new write), and whether any vendors actually implemented the procedure outlined above?

Thanks,
//Jim Klimov
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Jim Klimov
>
> I recently heard an argument from a colleague that "ZFS mis-uses
> the term COW" (Copy-On-Write). According to him, the original term
> was introduced by some vendors and was to be taken literally: that
> is, whenever a new write comes to update an existing logical block
> in the storage, the block's old contents are first copied away to
> another physical location (i.e. to be used for snapshotting or for
> recovery of untimely poweroff/panic), then the original on-disk
> location is rewritten with the new data.

What you described (actually copying the disk sectors upon a request to overwrite them) is what MS does. It may seem more intuitive to call this COW from a "files" perspective, but COW is a computer-science term that was used for memory before it was ever used for disk. The ZFS behavior follows the traditional meaning of COW with regard to memory management.

http://en.wikipedia.org/wiki/Copy-on-write

> Arguably, while this incurs a hit when rewriting existing data,
> this combats fragmentation and speeds up reads (i.e. all pieces of
> the file's "live" version are stored as contiguously as possible).
> This may be important for large objects randomly updated "inside",
> like VM disk images and iSCSI backing stores, precreated database
> table files, maybe swapfiles, etc.

Correct. Pay now or pay later. In some cases, pay now is better in the long run, and in some cases, pay later is better in the long run.
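The distinction being debated here can be made concrete with a toy model. This is a minimal sketch, not any vendor's actual implementation: the function names (`copy_before_write`, `redirect_on_write`) and the dict-based "block device" are purely illustrative. It contrasts the literal reading of COW (copy the old block aside, then overwrite in place) with the ZFS-style approach (write the new version elsewhere and repoint the block map, leaving the old block for any snapshot that references it).

```python
# Toy model of the two snapshot/write strategies discussed in this thread.
# All names and data structures are hypothetical; blocks are dict entries.

def copy_before_write(blocks, snapshot, addr, new_data):
    """'Literal' COW: save the old contents aside, then overwrite in place.
    Costs an extra read+write on the first overwrite after a snapshot;
    the live data stays at its original (contiguous) addresses."""
    if addr not in snapshot:            # first overwrite since the snapshot
        snapshot[addr] = blocks[addr]   # copy old contents elsewhere
    blocks[addr] = new_data             # rewrite the original location

def redirect_on_write(blocks, block_map, lba, new_data, next_free):
    """ZFS-style COW: never overwrite live data. Write the new version to
    a fresh location and switch the pointer; the old block simply stays
    where it is for any snapshot that still references it."""
    blocks[next_free] = new_data
    block_map[lba] = next_free          # pointer switch commits the write
    return next_free + 1                # next allocation cursor

# Usage sketch:
blocks, snap = {0: b"old"}, {}
copy_before_write(blocks, snap, 0, b"new")
assert blocks[0] == b"new" and snap[0] == b"old"

blocks2, bmap = {0: b"old"}, {0: 0}
redirect_on_write(blocks2, bmap, 0, b"new", next_free=1)
assert blocks2[bmap[0]] == b"new" and blocks2[0] == b"old"  # old block intact
```

The "pay now or pay later" trade-off falls out directly: the first variant pays an extra copy on every first overwrite, the second pays nothing at write time but lets the live file's blocks scatter.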
On Tue, Jun 5, 2012 at 6:32 AM, Jim Klimov <jimklimov at cos.ru> wrote:

> I recently heard an argument from a colleague that "ZFS mis-uses
> the term COW" (Copy-On-Write). According to him, the original term
> was introduced by some vendors and was to be taken literally: that
> is, whenever a new write comes to update an existing logical block
> in the storage, the block's old contents are first copied away to
> another physical location (i.e. to be used for snapshotting or for
> recovery of untimely poweroff/panic), then the original on-disk
> location is rewritten with the new data.

This is what I have seen "traditional" filesystems (UFS, VxFS) do when dealing with snapshots. Once a snapshot is taken, for any data that is being re-written, a copy of the original must be made before committing the write.

> Arguably, while this incurs a hit when rewriting existing data,

The hit to write performance can be substantial, and the space to store each snapshot's data can also be large. This is one of the big differences between ZFS and the others: the cost (in both write performance and space) of snapshots in ZFS is minimal, while for traditional filesystems it can be huge (depending on the number of snapshots).

> this combats fragmentation and speeds up reads (i.e. all pieces of
> the file's "live" version are stored as contiguously as possible).

As long as the file has not grown beyond the original allocation segment. Once you grow out of that, you are (usually) fragmented.

> This may be important for large objects randomly updated "inside",
> like VM disk images and iSCSI backing stores, precreated database
> table files, maybe swapfiles, etc.

--
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Assistant Technical Director, LoneStarCon 3 (http://lonestarcon3.org/)
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, Troy Civic Theatre Company
-> Technical Advisor, RPI Players
COW goes back at least to the early days of virtual memory and fork(). On fork() the kernel would arrange for writable pages in the parent process to be made read-only, so that writes to them could be caught; the page fault handler would then copy the page (and restore write access), so the parent and child each end up with their own private copies.

COW as used in ZFS is not the same, but that concept was also introduced very early, IIRC in the mid-80s -- certainly no later than BSD 4.4's log-structured filesystem (which ZFS resembles in many ways).

So, is COW a misnomer? Yes and no, and anyway, it's irrelevant. The important thing is that when you say COW, people understand that you're not saving a copy of the old thing but rather writing the new thing to a new location. (The old version of whatever was copied-on-write is stranded, unless -- of course -- you still have references to it from things like snapshots.)

Nico
--
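The fork() behavior described above can be observed from user space, even though the page-level mechanics stay hidden in the kernel. A minimal POSIX-only demonstration (assuming a Unix-like system where os.fork() is available): after the fork, parent and child logically share the same data; when the child writes, the kernel gives it a private copy of the affected page, so the parent's view is untouched.

```python
import os

# After fork(), parent and child share pages marked read-only; the kernel
# copies a page only when one side writes to it (copy-on-write), so each
# process ends up with a private copy of anything it modifies.
data = bytearray(b"parent")

pid = os.fork()
if pid == 0:                          # child process
    data[0:6] = b"child!"             # this write triggers the page copy
    os._exit(0 if bytes(data) == b"child!" else 1)
else:                                 # parent process
    _, status = os.waitpid(pid, 0)
    assert os.WEXITSTATUS(status) == 0    # child saw its own modification
    assert bytes(data) == b"parent"       # parent's copy was never touched
```

The ZFS analogue inverts the bookkeeping: instead of copying the old page so the writer can modify it in place, ZFS writes the new version to a new location, but in both cases a write is what triggers the divergence between the old and new views.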