Jim Mauro
2009-Mar-30 17:26 UTC
[zfs-discuss] [perf-discuss] ZFS performance issue - READ is slow as hell...
Cross posting to zfs-discuss.

By my math, here's what you're getting:
4.6MB/sec on writes to ZFS.
2.2MB/sec on reads from ZFS.
90MB/sec on reads from the block device.

What is c0t1d0 - I assume it's a hardware RAID LUN, but how many disks,
and what type of LUN? What version of Solaris (cat /etc/release)?
Please send "zpool status" output.

I'm not yet sure what's broken here, but there's something pathologically
wrong with the IO rates to the device during the ZFS tests. In both cases,
the wait queue is getting backed up, with horrific wait queue latency
numbers. On the read side, I don't understand why we're seeing 4-5 seconds
of zero disk activity on the read test in between bursts of a small number
of reads.

I just did a quick test on an X4600 (older 8 socket AMD box), running
Solaris nv103. Single disk ZFS.

For writes:

# ptime dd if=/dev/urandom of=/tzp/TESTFILE bs=1024k count=512
512+0 records in
512+0 records out

real       11.869
user        0.001
sys        11.861
#
# bc -l
(1024*1024*512) / 11.9
45115202.68907563025210084033

So that's 45MB/sec on the write.

Did an unmount/mount of the ZFS, and the read:

# ptime dd if=/tzp/TESTFILE of=/dev/null bs=1024k
512+0 records in
512+0 records out

real        2.696
user        0.000
sys         0.411
# bc -l
(1024*1024*512) / 2.69
199580264.68401486988847583643

So that's about 200MB/sec on the read. I did this several times, with
unmounts/mounts in between, to make sure I can replicate the numbers.

Do me a favor - capture "kstat -n arcstats" before a test, after the
write test and after the read test.

Sorry - I need to think about this a bit more. Something is seriously
broken, but I'm not yet sure what it is. Unless you're running an older
Solaris version, and/or missing patches.

Thanks,
/jim
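A minimal sketch of the capture sequence requested above, assuming a pool
named "tank" mounted at /tank; the pool name and output file names are
placeholders, not from the original thread:

# kstat -n arcstats > /var/tmp/arcstats.before

# ptime dd if=/dev/urandom of=/tank/TESTFILE bs=1024k count=512
# kstat -n arcstats > /var/tmp/arcstats.after-write

# zfs unmount tank && zfs mount tank      (drop cached data before the read)
# ptime dd if=/tank/TESTFILE of=/dev/null bs=1024k
# kstat -n arcstats > /var/tmp/arcstats.after-read

Comparing counters such as hits, misses and prefetch_data_misses across the
three snapshots shows whether the reads are being satisfied from the ARC or
are actually going out to disk.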
Jim Mauro
2009-Mar-30 17:42 UTC
[zfs-discuss] [perf-discuss] ZFS performance issue - READ is slow as hell...
I don't understand why disabling ZFS prefetch solved this problem.

The test case was a single threaded sequential write, followed by a single
threaded sequential read.

Anyone listening on ZFS have an explanation as to why disabling prefetch
solved Roland's very poor bandwidth problem?

My only thought here is you were tripping over a bug in the prefetch code.
I'm investigating that now.

Thanks,
/jim

roland wrote:
> Hello Ben,
>
>> If you want to put this to the test, consider disabling prefetch and
>> trying again. See http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide
>
> i should have read and followed advice better - this was the essential
> hint. thanks very much.
>
> after issuing
>   echo zfs_prefetch_disable/W0t1 | mdb -kw
> the problem immediately went away!
>
> so, why does solaris have such a stupid default set? is my workload that
> special? i cannot believe that i'm the only one with this problem, or the
> only one using zfs for backing up large files...
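For reference, the mdb command quoted above only patches the running
kernel and reverts on reboot. A sketch of the persistent equivalent, using
the standard /etc/system syntax for the same tunable (as described in the
Evil Tuning Guide linked above):

Runtime only (as quoted above):
# echo zfs_prefetch_disable/W0t1 | mdb -kw

Persistent across reboots, in /etc/system:
set zfs:zfs_prefetch_disable = 1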
Marion Hakanson
2009-Mar-31 18:29 UTC
[zfs-discuss] [perf-discuss] ZFS performance issue - READ is slow as hell...
James.Mauro at Sun.COM said:
> I'm not yet sure what's broken here, but there's something pathologically
> wrong with the IO rates to the device during the ZFS tests. In both cases,
> the wait queue is getting backed up, with horrific wait queue latency
> numbers. On the read side, I don't understand why we're seeing 4-5 seconds
> of zero disk activity on the read test in between bursts of a small number
> of reads.

We observed such long pauses (with zero disk activity) with a disk array
that was being fed more operations than it could handle (FC queue depth).
The array was not losing ops, but the OS would fill the device's queue and
then completely freeze on any disk-related activity for the affected LUNs.
All zpool or zfs commands related to those pools would be unresponsive
during those periods, until the load slowed down enough that the OS was no
longer ahead of the array.

This was with Solaris-10 here, not OpenSolaris or SXCE, but I suspect the
principle would still apply. Naturally, the original poster may have a very
different situation, so take the above as you wish.

Maybe Dtrace can help:
  http://blogs.sun.com/chrisg/entry/latency_bubble_in_your_io
  http://blogs.sun.com/chrisg/entry/latency_bubbles_follow_up
  http://blogs.sun.com/chrisg/entry/that_we_should_make

Note that using the above references, Dtrace showed that we had some FC
operations which took 60 or even 120 seconds to complete. Things got much
better here when we zeroed in on two settings:
  (a) set FC queue depth for the device to match its backend capacity (4).
  (b) turn off sorting of the queue by the OS/driver (latency evened out).

Regards,
Marion
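In the spirit of the blog entries linked above (a generic sketch, not taken
from them), a one-off DTrace invocation that quantizes per-I/O latency per
device, so multi-second outliers like the 60-120 second FC operations stand
out; run as root:

# dtrace -n '
  /* record the start time of each I/O, keyed by the buf pointer */
  io:::start { start[arg0] = timestamp; }

  /* on completion, add the elapsed time (in ms) to a per-device histogram */
  io:::done /start[arg0]/ {
      @lat[args[1]->dev_statname] =
          quantize((timestamp - start[arg0]) / 1000000);
      start[arg0] = 0;
  }'

The queue-depth change in (a) is normally made through the HBA or the
sd/ssd max_throttle tunables in /etc/system; the exact knob depends on the
driver in use, so check the driver documentation rather than taking a value
from here.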