valrhona at gmail.com
2010-Apr-11 06:32 UTC
[zfs-discuss] ZFS effective short-stroking and connection to thin provisioning?
A theoretical question on how ZFS works, for the experts on this board. I am wondering how and where ZFS puts the physical data on a mechanical hard drive. In the past, I have spent a lot of money on 15K rpm SCSI and then SAS drives, which of course have great performance. However, given the increase in areal density in modern consumer SATA drives, similar performance can be reached by short-stroking those drives: the outermost tracks are similar in performance to the average of the 15K drives, and sometimes exceed their peak.

My question is how ZFS lays the data out on the disk, and whether there's a way to capture some of this effect. It seems inefficient to physically short-stroke any of the drives; it would be more sensible to have ZFS handle this (if in fact it has this capability). If I am using mirrored pairs of 2 TB drives but only have a few hundred GB of data, and only the outer tracks are used, then in practice the performance should be similar to nearly-full 15K drives. Given that ZFS can also thin provision, thereby disconnecting the virtual space from the physical space on the drives, how does the data layout maximize performance?

The practical question: I have something like 600 GB of data on a mirrored pair of 2 TB Hitachi SATA drives, with compression and deduplication. Before, I had a RAID5 of four 147 GB 10K rpm Seagate Savvio 10K.2 2.5" SAS drives on a Dell PERC5/i caching RAID controller. The old RAID was nearly full (20-30 GB free) and performed substantially slower than the current setup in daily use (noticeably slower disk access and transfer rates), presumably because the drives were nearly full. I'm curious whether, if I switched from these two disks to the new Western Digital Velociraptors (10K rpm SATA), I could even tell the difference. Or, because those drives would be nearly full, would the whole setup be slower?
-- 
This message posted from opensolaris.org
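For what it's worth, one rough way to see the outer-track effect directly is to compare sequential reads from the two ends of a disk. This is only a sketch: the raw device path c1t0d0p0 is a placeholder for one of the 2 TB drives, and the skip value simply aims near the end of a 2 TB disk.

    # time a 1 GB sequential read from the start of the disk (low LBAs, outer tracks)
    time dd if=/dev/rdsk/c1t0d0p0 of=/dev/null bs=1024k count=1024

    # same amount from near the end of the disk (high LBAs, inner tracks)
    time dd if=/dev/rdsk/c1t0d0p0 of=/dev/null bs=1024k skip=1800000 count=1024

On most drives the first read finishes noticeably faster; that difference is what short-stroking (and, indirectly, a mostly empty pool) exploits.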
Richard Elling
2010-Apr-11 15:50 UTC
[zfs-discuss] ZFS effective short-stroking and connection to thin provisioning?
On Apr 10, 2010, at 11:32 PM, valrhona at gmail.com wrote:

> A theoretical question on how ZFS works, for the experts on this board. I am wondering how and where ZFS puts the physical data on a mechanical hard drive. In the past, I have spent a lot of money on 15K rpm SCSI and then SAS drives, which of course have great performance. However, given the increase in areal density in modern consumer SATA drives, similar performance can be reached by short-stroking those drives: the outermost tracks are similar in performance to the average of the 15K drives, and sometimes exceed their peak.

HDDs and performance do not mix. SSDs win. Game over.

> My question is how ZFS lays the data out on the disk, and whether there's a way to capture some of this effect. It seems inefficient to physically short-stroke any of the drives; it would be more sensible to have ZFS handle this (if in fact it has this capability). If I am using mirrored pairs of 2 TB drives but only have a few hundred GB of data, and only the outer tracks are used, then in practice the performance should be similar to nearly-full 15K drives. Given that ZFS can also thin provision, thereby disconnecting the virtual space from the physical space on the drives, how does the data layout maximize performance?

In general, the space with the lower-numbered LBA is allocated first. For many HDDs, the lower-numbered LBAs are on the outer cylinders. An easy way to see the allocations at a high level is to look at the metaslab statistics:

    # zdb -m syspool
    Metaslabs:
            vdev          0
            metaslabs   148   offset                spacemap          free
            ---------------   -------------------   ---------------   -------------
            metaslab      0   offset            0   spacemap     26   free     476M
            metaslab      1   offset     40000000   spacemap     41   free     481M
            metaslab      2   offset     80000000   spacemap     44   free     974M
            metaslab      3   offset     c0000000   spacemap     45   free     935M
            metaslab      4   offset    100000000   spacemap     46   free    1007M
            metaslab      5   offset    140000000   spacemap    110   free     935M
            metaslab      6   offset    180000000   spacemap    111   free    1019M
            metaslab      7   offset    1c0000000   spacemap      0   free       1G
            metaslab      8   offset    200000000   spacemap      0   free       1G
            metaslab      9   offset    240000000   spacemap      0   free       1G
            ...
            metaslab     27   offset    6c0000000   spacemap      0   free       1G
            metaslab     28   offset    700000000   spacemap     25   free    1012M
            metaslab     29   offset    740000000   spacemap     40   free    1011M
            metaslab     30   offset    780000000   spacemap      0   free       1G
            metaslab     31   offset    7c0000000   spacemap      0   free       1G
            metaslab     32   offset    800000000   spacemap      0   free       1G
            ...

Most of the data is allocated in the lower-numbered metaslabs. A bit later you can see where the redundant metadata is written. The rest is mostly free space. Remember that ZFS uses COW, so new writes will go to the free areas.

> The practical question: I have something like 600 GB of data on a mirrored pair of 2 TB Hitachi SATA drives, with compression and deduplication. Before, I had a RAID5 of four 147 GB 10K rpm Seagate Savvio 10K.2 2.5" SAS drives on a Dell PERC5/i caching RAID controller. The old RAID was nearly full (20-30 GB free) and performed substantially slower than the current setup in daily use (noticeably slower disk access and transfer rates), presumably because the drives were nearly full. I'm curious whether, if I switched from these two disks to the new Western Digital Velociraptors (10K rpm SATA), I could even tell the difference. Or, because those drives would be nearly full, would the whole setup be slower?

Yes, the drives will be able to push more media under the head. It is not clear that this will always give better performance.
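A quick way to tell whether the disks are actually the bottleneck before and after such a swap is to watch the device-level numbers under your normal workload. A sketch only; 'tank' is a placeholder pool name.

    # per-vdev bandwidth and IOPS, sampled every 5 seconds
    zpool iostat -v tank 5

    # per-disk service times and %busy
    iostat -xn 5

If the disks are rarely close to 100% busy, a faster spindle is unlikely to make a noticeable difference.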
Also, for writes, as the pool fills, it becomes more difficult to allocate free space. This is not a ZFS-only phenomenon; all file systems have some sort of allocation policy. However, there have been improvements in this area for ZFS over the past year or so.
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com