[originally reported for ZFS on FreeBSD but Pawel Jakub Dawidek
says this problem also exists on Solaris hence this email.]

Summary: on ZFS, the overhead for reading a hole seems far worse
than actually reading from a disk.  Small buffers are used to
make this overhead more visible.

I ran the following script on both ZFS and UFS2 filesystems.

[Note that on FreeBSD cat uses a 4k buffer and md5 uses a 1k
buffer.  On Solaris you can replace them with dd with the
respective buffer sizes for this test and you should see
similar results.]

$ dd </dev/zero bs=1m count=10240 >SPACY  # 10G zero bytes allocated
$ truncate -s 10G HOLEY                   # no space allocated

$ time dd <SPACY >/dev/null bs=1m         # A1
$ time dd <HOLEY >/dev/null bs=1m         # A2
$ time cat SPACY >/dev/null               # B1
$ time cat HOLEY >/dev/null               # B2
$ time md5 SPACY                          # C1
$ time md5 HOLEY                          # C2

I have summarized the results below.

                      ZFS                  UFS2
                Elapsed  System      Elapsed  System    Test
dd SPACY bs=1m   110.26   22.52       340.38   19.11     A1
dd HOLEY bs=1m    22.44   22.41        24.24   24.13     A2

cat SPACY        119.64   33.04       342.77   17.30     B1
cat HOLEY        222.85  222.08        22.91   22.41     B2

md5 SPACY        210.01   77.46       337.51   25.54     C1
md5 HOLEY        856.39  801.21        82.11   28.31     C2

A1, A2: Numbers are more or less as expected.  When doing large
reads, reading from "holes" takes far less time than reading from
a real disk.  We also see that the UFS2 disk is about 3 times
slower for sequential reads.

B1, B2: UFS2 numbers are as expected but the ZFS numbers for the
HOLEY file are much too high.  Why should *not* going to a real
disk cost more?  We also see that UFS2 handles holey files 10
times more efficiently than ZFS!

C1, C2: Again the UFS2 numbers and the C1 numbers for ZFS are as
expected, but the C2 numbers for ZFS are very high.  md5 uses
BLKSIZ (=1k) size reads and does hardly any other system calls.
For ZFS each syscall takes 76.4 microseconds while UFS2 syscalls
are 2.7 us each!  zpool iostat shows there is no IO to the real
disk, so this implies that for the HOLEY case ZFS read calls have
a significantly higher overhead or there is a bug.

Basically the C tests just confirm what we found in the B tests.
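For reference, the per-syscall figures quoted above follow from the file
size and md5's 1k reads: 10G read 1 KiB at a time is roughly 10.5 million
read() calls.  A quick back-of-the-envelope check (a sketch; it assumes
essentially all of the reported system time is spent in those read() calls):

$ echo '10 * 1024 * 1024' | bc                # 10G in 1k reads = 10485760 read() calls
$ echo 'scale=2; 801210000 / 10485760' | bc   # ZFS HOLEY system time in us  -> ~76.4 us per read()
$ echo 'scale=2; 28310000 / 10485760' | bc    # UFS2 HOLEY system time in us -> ~2.7 us per read()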
Pawel Jakub Dawidek
2007-May-03 22:21 UTC
[zfs-discuss] ZFS vs UFS2 overhead and may be a bug?
On Thu, May 03, 2007 at 02:15:45PM -0700, Bakul Shah wrote:
> [originally reported for ZFS on FreeBSD but Pawel Jakub Dawidek
> says this problem also exists on Solaris hence this email.]

Thanks!

> Summary: on ZFS, the overhead for reading a hole seems far worse
> than actually reading from a disk.  Small buffers are used to
> make this overhead more visible.
> 
> I ran the following script on both ZFS and UFS2 filesystems.
> 
> [Note that on FreeBSD cat uses a 4k buffer and md5 uses a 1k
> buffer.  On Solaris you can replace them with dd with the
> respective buffer sizes for this test and you should see
> similar results.]
> 
> $ dd </dev/zero bs=1m count=10240 >SPACY  # 10G zero bytes allocated
> $ truncate -s 10G HOLEY                   # no space allocated
> 
> $ time dd <SPACY >/dev/null bs=1m         # A1
> $ time dd <HOLEY >/dev/null bs=1m         # A2
> $ time cat SPACY >/dev/null               # B1
> $ time cat HOLEY >/dev/null               # B2
> $ time md5 SPACY                          # C1
> $ time md5 HOLEY                          # C2
> 
> I have summarized the results below.
> 
>                       ZFS                  UFS2
>                 Elapsed  System      Elapsed  System    Test
> dd SPACY bs=1m   110.26   22.52       340.38   19.11     A1
> dd HOLEY bs=1m    22.44   22.41        24.24   24.13     A2
> 
> cat SPACY        119.64   33.04       342.77   17.30     B1
> cat HOLEY        222.85  222.08        22.91   22.41     B2
> 
> md5 SPACY        210.01   77.46       337.51   25.54     C1
> md5 HOLEY        856.39  801.21        82.11   28.31     C2

This is what I see on Solaris (hole is 4GB):

# /usr/bin/time dd if=/ufs/hole of=/dev/null bs=128k
real       23.7
# /usr/bin/time dd if=/zfs/hole of=/dev/null bs=128k
real       21.2

# /usr/bin/time dd if=/ufs/hole of=/dev/null bs=4k
real       31.4
# /usr/bin/time dd if=/zfs/hole of=/dev/null bs=4k
real     7:32.2

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                         http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
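One way to break these runs down further (a sketch; truss -c summarizes
system-call counts and the time spent in each, so the per-read() cost on
the two filesystems can be compared directly, although truss itself adds
some overhead):

# truss -c dd if=/ufs/hole of=/dev/null bs=4k
# truss -c dd if=/zfs/hole of=/dev/null bs=4k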
Pawel Jakub Dawidek wrote:
> This is what I see on Solaris (hole is 4GB):
> 
> # /usr/bin/time dd if=/ufs/hole of=/dev/null bs=128k
> real       23.7
> # /usr/bin/time dd if=/zfs/hole of=/dev/null bs=128k
> real       21.2
> 
> # /usr/bin/time dd if=/ufs/hole of=/dev/null bs=4k
> real       31.4
> # /usr/bin/time dd if=/zfs/hole of=/dev/null bs=4k
> real     7:32.2

This is probably because the time to execute this on ZFS is dominated by
per-systemcall costs, rather than per-byte costs.  You are doing 32x more
system calls with the 4k blocksize, and it is taking 20x longer.

That said, I could be wrong, and yowtch, that's much slower than I'd like!

--matt
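A simple way to test the per-syscall hypothesis (a sketch, reusing the
/zfs/hole file from the quoted message): if per-call cost dominates, the
elapsed time should scale roughly with the number of read() calls, i.e.
inversely with the block size.

# for bs in 128k 32k 8k 4k; do echo bs=$bs; /usr/bin/time dd if=/zfs/hole of=/dev/null bs=$bs; done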
> Pawel Jakub Dawidek wrote:
> > This is what I see on Solaris (hole is 4GB):
> > 
> > # /usr/bin/time dd if=/ufs/hole of=/dev/null bs=128k
> > real       23.7
> > # /usr/bin/time dd if=/zfs/hole of=/dev/null bs=128k
> > real       21.2
> > 
> > # /usr/bin/time dd if=/ufs/hole of=/dev/null bs=4k
> > real       31.4
> > # /usr/bin/time dd if=/zfs/hole of=/dev/null bs=4k
> > real     7:32.2
> 
> This is probably because the time to execute this on ZFS is dominated by
> per-systemcall costs, rather than per-byte costs.  You are doing 32x more
> system calls with the 4k blocksize, and it is taking 20x longer.
> 
> That said, I could be wrong, and yowtch, that's much slower than I'd like!

You missed my earlier post where I showed that accessing a holey
file takes much longer than accessing a regular data file for
blocksizes of 4k and below.  I will repeat the most dramatic
difference:

                  ZFS                  UFS2
            Elapsed  System      Elapsed  System
md5 SPACY    210.01   77.46       337.51   25.54
md5 HOLEY    856.39  801.21        82.11   28.31

I used md5 because all but a couple of syscalls are for reading
the file (with a buffer of 1K).  dd would make an equal number of
calls for writing.  For both file systems and both cases the file
size is the same, but SPACY has 10GB allocated while HOLEY was
created with truncate -s 10G HOLEY.

Look at the system times.  On UFS2 the system time is a little
bit more for the HOLEY case because it has to clear a block.  On
ZFS it is over 10 times more!  Something is very wrong.
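If DTrace is available on the Solaris side, a kernel profile taken while
re-reading the HOLEY file would show where that system time goes (a
sketch; dd with bs=1k stands in for md5's 1k reads):

# dtrace -n 'profile-997 /arg0/ { @[stack()] = count(); }' -c 'dd if=HOLEY of=/dev/null bs=1k'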
Robert Milkowski
2007-May-08 08:50 UTC
[zfs-discuss] ZFS vs UFS2 overhead and may be a bug?
Hello Matthew,

Tuesday, May 8, 2007, 1:04:56 AM, you wrote:

MA> Pawel Jakub Dawidek wrote:
>> This is what I see on Solaris (hole is 4GB):
>> 
>> # /usr/bin/time dd if=/ufs/hole of=/dev/null bs=128k
>> real       23.7
>> # /usr/bin/time dd if=/zfs/hole of=/dev/null bs=128k
>> real       21.2
>> 
>> # /usr/bin/time dd if=/ufs/hole of=/dev/null bs=4k
>> real       31.4
>> # /usr/bin/time dd if=/zfs/hole of=/dev/null bs=4k
>> real     7:32.2

MA> This is probably because the time to execute this on ZFS is dominated by
MA> per-systemcall costs, rather than per-byte costs.  You are doing 32x more
MA> system calls with the 4k blocksize, and it is taking 20x longer.

MA> That said, I could be wrong, and yowtch, that's much slower than I'd like!

But 4k for UFS took him 31s while 4k for ZFS took him 14x more time!
In both cases the same number of syscalls were executed.

-- 
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                     http://milek.blogspot.com
Robert Milkowski wrote:
> Hello Matthew,
> 
> Tuesday, May 8, 2007, 1:04:56 AM, you wrote:
> 
> MA> Pawel Jakub Dawidek wrote:
> >> This is what I see on Solaris (hole is 4GB):
> >> 
> >> # /usr/bin/time dd if=/ufs/hole of=/dev/null bs=128k
> >> real       23.7
> >> # /usr/bin/time dd if=/zfs/hole of=/dev/null bs=128k
> >> real       21.2
> >> 
> >> # /usr/bin/time dd if=/ufs/hole of=/dev/null bs=4k
> >> real       31.4
> >> # /usr/bin/time dd if=/zfs/hole of=/dev/null bs=4k
> >> real     7:32.2
> 
> MA> This is probably because the time to execute this on ZFS is dominated by
> MA> per-systemcall costs, rather than per-byte costs.  You are doing 32x more
> MA> system calls with the 4k blocksize, and it is taking 20x longer.
> 
> MA> That said, I could be wrong, and yowtch, that's much slower than I'd like!
> 
> But 4k for UFS took him 31s while 4k for ZFS took him 14x more time!
> In both cases the same number of syscalls were executed.

So, I'm hearing two claims here:  ZFS is much slower than UFS when reading
a sparse file, and ZFS is much slower at reading a sparse file than a
filled-in file.  However, I am not able to reproduce these results.  On a
2-CPU, 2.2GHz Opteron, with most recent Nevada bits, nondebug:

on ZFS, 4k recordsize, ptime dd if=file of=/dev/null bs=4k
    sparse:          1.3sec real
    filled, cached:  0.9sec real

on ZFS, 1k recordsize, ptime dd if=file of=/dev/null bs=1k
    sparse:          5.4sec real
    filled, cached:  3.5sec real

on UFS (4k blocksize), ptime dd if=file of=/dev/null bs=4k
    sparse, cached:  0.8sec real
    filled, cached:  0.8sec real

In summary:  ZFS is ~40% slower than UFS when reading sparse files (with 4k
block/recordsize).  ZFS is ~40% slower reading a sparse file than a cached
filled-in file (with 4k or 1k recordsize).  This is because we don't cache
the sparse buffers, so we spend more time instantiating and zeroing them.

So, I'm not sure how you are getting your 20x number.  Are you sure that
you aren't using debug bits or something?

--matt
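For anyone trying to repeat this comparison, a minimal sketch (tank/t is a
placeholder dataset name, and mkfile -n stands in for FreeBSD's truncate
to create the sparse file; the filled file is read once beforehand so the
timed read is cached):

# zfs create tank/t
# zfs set recordsize=4k tank/t
# mkfile -n 4g /tank/t/sparse                            # sparse: no blocks allocated
# dd if=/dev/zero of=/tank/t/filled bs=4k count=1048576  # filled: 4GB of real blocks
# dd if=/tank/t/filled of=/dev/null bs=4k                # prime the cache
# ptime dd if=/tank/t/sparse of=/dev/null bs=4k
# ptime dd if=/tank/t/filled of=/dev/null bs=4k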