I just did the following test:

- Create two 10 GB files on a UFS partition

- make them available with lofiadm

- create zfs on one of them and ufs on the other

- mount and extract the FreeDB files:
  freedb-complete-20051104.tar.bz2
  This is ~ 1.9 million files in 11 dirs

- Extraction on UFS:  2:49 real   5280.010 seconds System
- Extraction on ZFS:  9:35 real   2862.890 seconds System

- find is _extremely_ slow on ZFS

- it seems that ZFS causes a lot more I/O than UFS

- sfind . | count   did not finish after 4.5 hours on ZFS

- sfind . | count   did finish after 2:45 minutes on UFS

- the calls to getdents() are _extremely_ slow on ZFS

Jörg

-- 
 EMail: joerg at schily.isdn.cs.tu-berlin.de (home)  Jörg Schilling  D-13353 Berlin
        js at cs.tu-berlin.de                (uni)
        schilling at fokus.fraunhofer.de     (work)  Blog: http://schily.blogspot.com/
 URL:   http://cdrecord.berlios.de/old/private/  ftp://ftp.berlios.de/pub/schily
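For reference, a minimal sketch of how such a lofi-backed comparison can be set
up; the file names, pool name, and mount points below are illustrative
assumptions, not taken from the original posting:

    # back the two test filesystems with 10 GB files on an existing UFS partition
    mkfile 10g /export/bench/zfs.img
    mkfile 10g /export/bench/ufs.img

    # attach them as block devices; lofiadm prints the device names
    lofiadm -a /export/bench/zfs.img     # e.g. /dev/lofi/1
    lofiadm -a /export/bench/ufs.img     # e.g. /dev/lofi/2

    # ZFS pool on the first lofi device (mounted at /benchpool by default),
    # UFS on the second
    zpool create benchpool /dev/lofi/1
    newfs /dev/rlofi/2
    mkdir -p /mnt/ufs && mount /dev/lofi/2 /mnt/ufs

    # time the extraction on each filesystem (star autodetects the bzip2 compression)
    (cd /benchpool && ptime star -x f=/var/tmp/freedb-complete-20051104.tar.bz2)
    (cd /mnt/ufs   && ptime star -x f=/var/tmp/freedb-complete-20051104.tar.bz2)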
On Wed, Nov 23, 2005 at 07:33:05PM +0100, Joerg Schilling wrote:
> I just did the following test:
>
> - Create two 10 GB files on a UFS partition
>
> - make them available with lofiadm
>
> - create zfs on one of them and ufs on the other
>
> - mount and extract the FreeDB files:
>   freedb-complete-20051104.tar.bz2
>   This is ~ 1.9 million files in 11 dirs
>
> - Extraction on UFS:  2:49 real   5280.010 seconds System
> - Extraction on ZFS:  9:35 real   2862.890 seconds System
>
> - find is _extremely_ slow on ZFS
>
> - it seems that ZFS causes a lot more I/O than UFS
>
> - sfind . | count   did not finish after 4.5 hours on ZFS
>
> - sfind . | count   did finish after 2:45 minutes on UFS
>
> - the calls to getdents() are _extremely_ slow on ZFS

You _can't_ be serious -- you're mounting ON TOP OF ANOTHER FILESYSTEM
and using that to draw performance conclusions?  Repeat your experiment
on actual, physical spindles; no one is (or should) take this kind of
analysis seriously until then.  This is not to say that there aren't
issues here (the getdents() issue in particular has been/is being
addressed), just to say that one cannot _assume_ that performance issues
found in this configuration are issues with ZFS.  Please repeat this
test on physical spindles...

	- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
Bryan Cantrill <bmc at eng.sun.com> wrote:

> You _can't_ be serious -- you're mounting ON TOP OF ANOTHER FILESYSTEM
> and using that to draw performance conclusions?  Repeat your experiment
> on actual, physical spindles; no one is (or should) take this kind of
> analysis seriously until then.  This is not to say that there aren't
> issues here (the getdents() issue in particular has been/is being
> addressed), just to say that one cannot _assume_ that performance issues
> found in this configuration are issues with ZFS.  Please repeat this
> test on physical spindles...

As _both_ tests (ufs vs zfs) did have the same constraints, I see no
reason to distrust my results....

If you like to repeat the test and have a real disk to do the tests,
get the archives from freedb.org

Jörg
On Wed, Nov 23, 2005 at 07:56:38PM +0100, Joerg Schilling wrote:
> Bryan Cantrill <bmc at eng.sun.com> wrote:
>
> > You _can't_ be serious -- you're mounting ON TOP OF ANOTHER FILESYSTEM
> > and using that to draw performance conclusions?  Repeat your experiment
> > on actual, physical spindles; no one is (or should) take this kind of
> > analysis seriously until then.  This is not to say that there aren't
> > issues here (the getdents() issue in particular has been/is being
> > addressed), just to say that one cannot _assume_ that performance issues
> > found in this configuration are issues with ZFS.  Please repeat this
> > test on physical spindles...
>
> As _both_ tests (ufs vs zfs) did have the same constraints, I see no
> reason to distrust my results....

The problem is that while it's not _necessarily_ invalid, it's not
necessarily _valid_ either.  UFS is _not_ an accurate simulator of a
physical device: it doesn't have head latency or rotational latency, for
example.  As a result, systems that were designed to perform well with
respect to the physical device (e.g. ZFS) can look disproportionately bad,
while systems that did not have such a design center (e.g. UFS) can look
disproportionately good.  To take an extreme example of this: if you
were to take a simple virtual device that simply took a real device and
scrambled its logical block numbers, ZFS would perform terribly -- almost
certainly worse than other systems.  And again, there may well be valid
results in your data -- but the potential presence of invalid results
discounts them, such as they are.

	- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc
> I just did the following test:
>
> - Create two 10 GB files on a UFS partition
>
> - make them available with lofiadm

lofi doesn't have stellar performance (certainly not
disk like) and could be hampering the test results because
of all the interactions.

> - find is _extremely_ slow on ZFS

In the 27a build, both getdents and stat() are very slow;
a number of improvements went into b28/b29.

Casper
> You _can't_ be serious -- you're mounting ON TOP OF ANOTHER FILESYSTEM
> and using that to draw performance conclusions?  Repeat your experiment
> on actual, physical spindles; no one is (or should) take this kind of
> analysis seriously until then.  This is not to say that there aren't
> issues here (the getdents() issue in particular has been/is being
> addressed), just to say that one cannot _assume_ that performance issues
> found in this configuration are issues with ZFS.  Please repeat this
> test on physical spindles...

There are, indeed, many issues here.

E.g., if you don't change ufs_WRITES to 0 you may be hit by ufs
write throttling; and if the ZFS working set is larger than the
ufs working set, it may be severely penalized because of this.

Don't forget that the UFS file the filesystem resides on is
cached; in-place writing as UFS does will appear to cause
fewer writes, as they don't find their way to disk as they
would on a physical filesystem.

Casper
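For reference, a rough sketch of how the ufs_WRITES write-throttling tunable
can be turned off for such a test; the mdb idiom is standard, but treat the
exact steps as an assumption rather than something prescribed in this thread:

    # disable UFS write throttling on the running kernel (reverts on reboot)
    echo 'ufs_WRITES/W 0' | mdb -kw

    # or persistently, via /etc/system and a reboot:
    #   set ufs:ufs_WRITES=0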
Casper.Dik at Sun.COM wrote:

> > I just did the following test:
> >
> > - Create two 10 GB files on a UFS partition
> >
> > - make them available with lofiadm
>
> lofi doesn't have stellar performance (certainly not
> disk like) and could be hampering the test results because
> of all the interactions.

Please read my mail again: Both tests did use lofi.

> > - find is _extremely_ slow on ZFS
>
> In the 27a build, both getdents and stat() are very slow;
> a number of improvements went into b28/b29.

stat() is not a problem, but a single getdents() call takes 3 seconds.

It also seems to be important that getdents causes a lot of I/O in ZFS.

                   extended device statistics                   tty        cpu
device   r/s   w/s   kr/s  kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
cmdk0  149.6   0.2 5049.2   0.4  0.0  0.9    6.1   4  70    0  116   0 49  0 50
lofi1   73.4   0.0 4696.7   0.0  0.0  0.7   10.0   1  73

Jörg
> Casper.Dik at Sun.COM wrote:
>
> > > I just did the following test:
> > >
> > > - Create two 10 GB files on a UFS partition
> > >
> > > - make them available with lofiadm
> >
> > lofi doesn't have stellar performance (certainly not
> > disk like) and could be hampering the test results because
> > of all the interactions.
>
> Please read my mail again: Both tests did use lofi.

I understand that; but ufs and zfs behave differently, and where ufs
may get away with it, zfs may not.

> > > - find is _extremely_ slow on ZFS
> >
> > In the 27a build, both getdents and stat() are very slow;
> > a number of improvements went into b28/b29.
>
> stat() is not a problem, but a single getdents() call takes 3 seconds.
>
> It also seems to be important that getdents causes a lot of I/O in ZFS.
>
>                    extended device statistics                   tty        cpu
> device   r/s   w/s   kr/s  kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
> cmdk0  149.6   0.2 5049.2   0.4  0.0  0.9    6.1   4  70    0  116   0 49  0 50
> lofi1   73.4   0.0 4696.7   0.0  0.0  0.7   10.0   1  73

I believe it does inode prefetching, which turned out not to be a good
idea in all cases.....

Was ufs_WRITES set to 0?  If not, then any difference in write working set
may have serious repercussions.

Casper
Casper.Dik at Sun.COM wrote:

> There are, indeed, many issues here.
>
> E.g., if you don't change ufs_WRITES to 0 you may be hit by ufs
> write throttling; and if the ZFS working set is larger than the
> ufs working set, it may be severely penalized because of this.
>
> Don't forget that the UFS file the filesystem resides on is
> cached; in-place writing as UFS does will appear to cause
> fewer writes, as they don't find their way to disk as they
> would on a physical filesystem.

Write caching should have been disabled by the sticky bit.

As iostat shows that there are mainly reads, I assume that the problems
are zfs caused.

Note that I am using sfind and not Sun find, so _all_ getdents() calls
are done at once, _followed_ by a stat() loop on the resulting list.

Note that a sfind on the ufs test FS takes less than 3 minutes, and that
I currently estimate the sfind on the ZFS test FS to take approx. 9 hours.

Jörg
Casper.Dik at Sun.COM wrote:

> > stat() is not a problem, but a single getdents() call takes 3 seconds.
> >
> > It also seems to be important that getdents causes a lot of I/O in ZFS.
> >
> >                    extended device statistics                   tty        cpu
> > device   r/s   w/s   kr/s  kw/s wait actv  svc_t  %w  %b  tin tout  us sy wt id
> > cmdk0  149.6   0.2 5049.2   0.4  0.0  0.9    6.1   4  70    0  116   0 49  0 50
> > lofi1   73.4   0.0 4696.7   0.0  0.0  0.7   10.0   1  73
>
> I believe it does inode prefetching, which turned out not to be a good
> idea in all cases.....
>
> Was ufs_WRITES set to 0?  If not, then any difference in write working set
> may have serious repercussions.

As expected, setting ufs_WRITES to 0 did not change anything:

truss -d -p 101937
Base time stamp:  1132781010.2370  [ Wed Nov 23 22:23:30 CET 2005 ]
 3.7073 getdents64(4, 0xC6EA4000, 8192)             = 8192
 7.3568 getdents64(4, 0xC6EA4000, 8192)             = 8192
11.3254 getdents64(4, 0xC6EA4000, 8192)             = 8192
15.7433 getdents64(4, 0xC6EA4000, 8192)             = 8192
19.3317 getdents64(4, 0xC6EA4000, 8192)             = 8192
23.1085 getdents64(4, 0xC6EA4000, 8192)             = 8192
26.5186 getdents64(4, 0xC6EA4000, 8192)             = 8192
30.2388 getdents64(4, 0xC6EA4000, 8192)             = 8192

ufs_WRITES was set to 0 at the 11.3254 timestamp.

Jörg
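For anyone who wants the getdents64() latency distribution directly rather than
eyeballing truss timestamps, a dtrace one-liner along these lines should work
(a sketch, assuming a running sfind process; adjust the process name as needed):

    dtrace -p $(pgrep -x sfind) -n '
        syscall::getdents64:entry  /pid == $target/ { self->ts = timestamp; }
        syscall::getdents64:return /self->ts/ {
            @["getdents64 latency (ns)"] = quantize(timestamp - self->ts);
            self->ts = 0;
        }'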
Bryan Cantrill <bmc at eng.sun.com> wrote:

> > As _both_ tests (ufs vs zfs) did have the same constraints, I see no
> > reason to distrust my results....
>
> The problem is that while it's not _necessarily_ invalid, it's not
> necessarily _valid_ either.  UFS is _not_ an accurate simulator of a
> physical device: it doesn't have head latency or rotational latency, for
> example.  As a result, systems that were designed to perform well with
> respect to the physical device (e.g. ZFS) can look disproportionately bad,
> while systems that did not have such a design center (e.g. UFS) can look

I cannot speak for ZFS as I did not yet look into the sourcecode, but
UFS has been designed to perform well with physical devices.

But note: I am doing my tests because I would like to find out whether it
would make sense to use ZFS on a SchilliX CD.  CD drives behave
substantially differently from hard disk drives.  If you do the wrong
things, you blow up the read cache of the drive, and seeks are extremely
expensive on a CD.

> disproportionately good.  To take an extreme example of this: if you
> were to take a simple virtual device that simply took a real device and
> scrambled its logical block numbers, ZFS would perform terribly -- almost
> certainly worse than other systems.  And again, there may well be valid
> results in your data -- but the potential presence of invalid results
> discounts them, such as they are.

As I observe extremely high read rates on the backing media with ZFS, I am
confident in the current estimate that ZFS is 200 times slower than UFS
with getdents().  I am sure this ratio will never go down to 1 : 1 on a
different physical background storage.

My observation is:

- at the beginning of a directory, getdents() takes 0.1 .. 1 seconds
- when the directory offset reaches some limit, a single getdents()
  reaches a constant value of 3 seconds.

Jörg
Joerg Schilling wrote:
> I cannot speak for ZFS as I did not yet look into the sourcecode, but
> UFS has been designed to perform well with physical devices.

Physical devices circa 1980 ;-)
 -- richard
On Wed, Nov 23, 2005 at 07:33:05PM +0100, Joerg Schilling wrote:
> I just did the following test:
>
> - Create two 10 GB files on a UFS partition

Right, so how is this testing ZFS?

How about with a slightly more realistic situation - filesystems directly
on disk!  (Novel concept, I know....)

In all cases the hardware is a T3 disk array, with a 4-disk stripe in
hardware (no mirroring/RAID-5).  For ZFS, this is then imported as a
single filesystem in the pool.  For UFS, as a single filesystem, with
logging enabled.

In all cases the source file was stored in /tmp (tmpfs), and was dd'ed
to /dev/null before starting so that any caching was the same between
each run.  Both the ZFS and UFS filesystems were re-created between runs.

With the T3's cache disabled:
  ZFS - 17 minutes, 47 seconds real (5:35 user, 0:11 system)
  UFS - 48 minutes, 28 seconds real (5:38 user, 0:13 system)

With the T3's cache enabled:
  ZFS - 15 minutes, 30 seconds real (5:49 user, 0:13 system)
  UFS - 24 minutes, 29 seconds real (5:39 user, 0:13 system)

So realistically ZFS is _significantly_ faster than UFS (for the untar at
least).  It also seems far less reliant on the speed of the underlying
disk (as is seen by the minimal difference the hardware cache makes).

ZFS layered on top of lofi layered on top of UFS may be slower, but
that's not exactly the use case it was designed for!

> - find is _extremely_ slow on ZFS

"find", or "sfind" ?

Running "find . | wc -l" (Solaris find).  In both cases the filesystem
cache was cleared before running (zfs export/import, ufs unmount/mount).
  ZFS - 3 minutes, 39 seconds (0:07 user, 2:58 system)
  UFS - 1 minute, 16 seconds (0:06 user, 0:50 system)

As others have said, there are reasons this is slow, and it's being worked
on, but it's certainly not a matter of it taking "hours".  This is either
something wrong with sfind, or an artifact of running ZFS on lofi on UFS.

Full results below.

  Scott

## T3 cache disabled.  New ZFS and UFS filesystems created ##

# cd /zfs
# dd if=/tmp/freedb-complete-20051104.tar.bz2 of=/dev/null bs=1048576 >/dev/null
# ptime bzcat /tmp/freedb-complete-20051104.tar.bz2 | tar xf -

real    17:46.932
user     5:34.820
sys        10.825

# cd /ufs
# dd if=/tmp/freedb-complete-20051104.tar.bz2 of=/dev/null bs=1048576 >/dev/null
# ptime bzcat /tmp/freedb-complete-20051104.tar.bz2 | tar xf -

real    46:28.445
user     5:37.799
sys        12.962

## T3 cache enabled.  New ZFS and UFS filesystems created ##

# cd /zfs
# dd if=/tmp/freedb-complete-20051104.tar.bz2 of=/dev/null bs=1048576 >/dev/null
# ptime bzcat /tmp/freedb-complete-20051104.tar.bz2 | tar xf -

real    15:29.661
user     5:49.150
sys        12.808

# cd /ufs
# dd if=/tmp/freedb-complete-20051104.tar.bz2 of=/dev/null bs=1048576 >/dev/null
# ptime bzcat /tmp/freedb-complete-20051104.tar.bz2 | tar xf -

real    24:29.378
user     5:38.632
sys        12.845

## T3 cache enabled.  ZFS and UFS caches cleared ##

# cd /zfs
# ptime find . | wc -l

real     3:39.084
user        7.437
sys      2:57.592
 1872972

# cd /ufs
# ptime find . | wc -l

real     1:16.667
user        5.532
sys        49.787
 1872971
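A note on the cache-clearing step mentioned above - a rough sketch of what
"zfs export/import, ufs unmount/mount" between runs can look like (pool name
and UFS device are assumptions, not Scott's actual configuration):

    # drop cached ZFS data and metadata by re-importing the pool
    zpool export testpool && zpool import testpool

    # drop cached UFS data by remounting the filesystem
    umount /ufs && mount /dev/dsk/c1t1d0s0 /ufs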
Richard Elling - PAE wrote:
> Joerg Schilling wrote:
> > I cannot speak for ZFS as I did not yet look into the sourcecode, but
> > UFS has been designed to perform well with physical devices.
>
> Physical devices circa 1980 ;-)
>  -- richard

If your assumptions are older than your car ... :-)
Scott Howard wrote:
> On Wed, Nov 23, 2005 at 07:33:05PM +0100, Joerg Schilling wrote:
>
> > I just did the following test:
> >
> > - Create two 10 GB files on a UFS partition
>
> Right, so how is this testing ZFS?
>
> How about with a slightly more realistic situation - filesystems directly
> on disk!  (Novel concept, I know....)

I was part way through this when I saw Scott's email with some real world
test results.  I thought I'd still post, as I did things slightly
differently...

First try: Because Joerg's tests were presumably hosted on a filesystem
that was on a single physical device, I thought I'd try a one-slice ZFS
pool (/test), and a one-slice ufs filesystem (/a).  They are on the first
13GB of their respective, identical, disks, and the disks hang off the
same PCI SATA controller.  The disks may have write caching enabled, I
suspect.  (Is there an easy way to tell from within Solaris?)

I tried star-1.4.3 and Solaris tar, sfind-1.0 and Solaris find.
Differences between sfind and find performance were not interesting
(<10%), but star and tar behave very differently.

I got bored of extracting the whole database to ZFS, so I created a
~111,000 file fragment of the whole freedb .tar.bz2 (basically, I hit ^C
and tar/bzipped what I had extracted by that time. :-)  Even with the
smaller dataset there are some marked differences between ZFS and UFS
performance.  The source .tar.bz2 file is only 22MB, so I didn't worry
about repeatable caching of the source - it's down in the noise.
Filesystems were recreated in between extraction tests, and remounted
between traversal tests.

Firstly, star and tar perform very, very differently across ZFS and UFS
[all times in seconds, hope the formatting is preserved]:

        UFS     ZFS
star     70    1086
tar     402      28

For this extreme "create zillions of tiny files in two directories"
workload, Joerg's star extracts 38 times slower to ZFS than Solaris tar.
But it extracts almost 6 times faster to UFS than Solaris tar.

I got bored/puzzled during the tar extract to UFS, so I ran iostat in
another window:

# iostat -xn 2
[first report omitted as it's always silly]
                    extended device statistics
    r/s    w/s   kr/s   kw/s    wait actv  wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0     0.0  0.0     0.0    0.0   0   0 c1d0
    0.0    0.0    0.0    0.0     0.0  0.0     0.0    0.0   0   0 c2d0
    0.0  247.7    0.0  331.4 16022.4  2.0 64686.5    8.1 100 100 c3d0
    0.0    0.0    0.0    0.0     0.0  0.0     0.0    0.0   0   0 c4d0

Note the wait service time for c3d0. :-)  There was more of the same,
with wsvc_t dropping slowly.  Clearly star and tar do things very very
differently!

As for sfind versus find, the comparison is much less interesting:

        UFS     ZFS
sfind   1.43    3.16
find    1.41    3.10

Just as Scott found, ZFS as delivered in NV27a is definitely slower for
this test, and the difference between sfind and find is negligible.  But
it's not pathologically slow like Joerg's ZFS-over-lofi-over-UFS test
setup would suggest.

Cheers,
 Jason =:^)

ps to Joerg: imho, the only reason to ever extract that freedb tarfile on
something other than a ReiserFS filesystem would be to ingest it into a
database.  You wouldn't design such a data organisation unless you were
assuming something like ReiserFS in any case.  Surely something less
pathological (like, say, the SchilliX live CD filesystem?) would be more
appropriate as a performance test of ZFS?  Especially when that's
apparently what you were thinking of using ZFS for anyway...
Raw command output included below for completeness (filesystem fiddling
not shown).

star extract to UFS:

$ (cd /a/jao ; time -p star xf /mnt/jao/freedb-piece.bz2 )
star: WARNING: Archive is bzip2 compressed, trying to use the -bz option.
star: 18635 blocks + 0 bytes (total of 190822400 bytes = 186350.00k).
real 70.04
user 15.80
sys 16.58

star extract to ZFS:

$ (cd /test/jao ; time -p star xf /mnt/jao/freedb-piece.bz2 )
star: WARNING: Archive is bzip2 compressed, trying to use the -bz option.
star: current './' newer.
star: 18635 blocks + 0 bytes (total of 190822400 bytes = 186350.00k).
real 1086.52
user 15.19
sys 16.25

tar extract to UFS:

$ (cd /a/jao ; time -p bzcat /mnt/jao/freedb-piece.bz2 | tar xf - )
real 402.70
user 16.23
sys 14.31

tar extract to ZFS:

$ (cd /test/jao ; time -p bzcat /mnt/jao/freedb-piece.bz2 | tar xf - )
real 28.06
user 16.13
sys 9.01

sfind on UFS:

$ (cd /a/jao ; time -p sfind . | wc )
  110966  110966 1775424
real 1.43
user 0.06
sys 0.64

sfind on ZFS:

$ (cd /test/jao ; time -p sfind . | wc )
  110966  110966 1775424
real 3.16
user 0.07
sys 2.07

find on UFS:

$ (cd /a/jao ; time -p find . | wc )
  110966  110966 1775424
real 1.41
user 0.08
sys 0.65

find on ZFS:

$ (cd /test/jao ; time -p find . | wc )
  110966  110966 1775424
real 3.10
user 0.12
sys 2.00
I did some tests on my laptop (2.2GHz P4, 5400RPM ATA) running Nexenta
with ZFS on a partition.  Extracting the tarball took 71m31.297s,
"find . >/dev/null" took 85m8.983s, and, possibly most alarming,
"rm -rf *" took 270m13.031s.

It isn't exactly a fair comparison, but I tried the same with reiser4 on
my Gentoo desktop (2.4GHz amd64, 7200RPM ATA with 8MB cache).  The
tarball extracted in 30m15.974s, find took 1m49.597s, and rm, which is
normally a weakness for reiser4, took 14m29.083s.

Hardware differences might be responsible for extracting twice as fast,
but find is over 46 times as fast, and rm is over 18 times as fast on
reiser4.

I'll test build 29 on my amd64 machine for a fair comparison when there's
a SchilliX or Nexenta build of it.
Richard Elling - PAE <Richard.Elling at Sun.COM> wrote:
> Joerg Schilling wrote:
> > I cannot speak for ZFS as I did not yet look into the sourcecode, but
> > UFS has been designed to perform well with physical devices.
>
> Physical devices circa 1980 ;-)

For supporting devices from 1980, you need extra effort that could be
omitted today.

If there is documentation on how ZFS does device optimization, please
send me a pointer.  I will read and comment....

Jörg
Jason Ozolins <Jason.Ozolins at anu.edu.au> wrote:

> First try: Because Joerg's tests were presumably hosted on a filesystem that
> was on a single physical device, I thought I'd try a one-slice ZFS pool
> (/test), and a one-slice ufs filesystem (/a).  They are on the first 13GB of
> their respective, identical, disks, and the disks hang off the same PCI
> SATA controller.  The disks may have write caching enabled, I suspect.  (Is
> there an easy way to tell from within Solaris?)  I tried star-1.4.3 and

star-1.4.3 is _extremely_ old.  I would recommend to use star-1.5a70.

> Solaris tar, sfind-1.0 and Solaris find.  Differences between sfind and find
> performance were not interesting (<10%), but star and tar behave very
> differently.

Sun find is done the ancient way, while sfind uses recent technologies to
operate.  Sfind is typically 10% faster than Sun find.

Sun find spends slightly less user CPU time than sfind.  This is caused by
the fact that treewalk() in sfind calls the walk callback function, which
does some checks and then calls the pure find expression interpreter.  The
callback function in Sun find directly calls one big and unstructured
function... making it impossible to create a find library out of Sun find,
as was done with sfind for star.

However, as sfind reads all directories in one big chunk _before_ starting
to work on the list (as all modern software does), the tree walker from
sfind is faster than nftw() used by Sun find.  This causes sfind to need
less system CPU time than Sun find, and this is why sfind is a bit faster
than Sun find.

The differences between Sun tar and star are caused by the fact that star
operates in a secure and comprehensible way, while Sun tar cannot even
tell whether there have been specific problems during the extract
operation.

In order to tell whether star was able to extract a file correctly, it
calls fsync(f) and close(f) for every file and checks the return codes.
This causes a 10-20% performance penalty on UFS (depending on file sizes).
If you like to find the performance penalty caused by the fact that star
is able to tell you whether it did work correctly, compare the time that
you need to run:

	star -xp f=xxx
and
	star -xp f=xxx -no-fsync

BTW: On Linux and ext2, I did see a 400% performance penalty when running
in secure (default) fsync mode.  This is caused by the fact that ext2 +
Linux starts disk transfers very late and needs a long time to finally
sync the FS cache to the disk.

> I got bored of extracting the whole database to ZFS, so I created a ~111,000
> file fragment of the whole freedb .tar.bz2 (basically, I hit ^C and
> tar/bzipped what I had extracted by that time. :-)  Even with the smaller
> dataset there are some marked differences between ZFS and UFS performance.
> The source .tar.bz2 file is only 22MB, so I didn't worry about repeatable
> caching of the source - it's down in the noise.  Filesystems were recreated
> in between extraction tests, and remounted between traversal tests.
>
> Firstly, star and tar perform very, very differently across ZFS and UFS [all
> times in seconds, hope the formatting is preserved]:
>
>         UFS     ZFS
> star     70    1086
> tar     402      28

This is _really_ interesting!

But note first: you cannot compare these times without checking the state
of the FS at the time when the untar operation did finish.  You should at
least run two sync(1) calls after the tar extract and include the sync
time in the time of the untar operation.  Otherwise you just compared
times for two completely unknown and different tasks.

If you like to compare ZFS vs. UFS, I recommend to run 4 tests:

	star -x on UFS, star -x -no-fsync on UFS,
	star -x on ZFS, star -x -no-fsync on ZFS

If you like to compare Sun tar vs. star, you need to understand what both
programs do.  A Sun tar test without syncing at the end of the operation
is worthless.

[snipped part]

> As for sfind versus find, the comparison is much less interesting:
>
>         UFS     ZFS
> sfind   1.43    3.16
> find    1.41    3.10

My tests look different (currently I may only give UFS results for a real
disk device).  This is the whole FreeDB tree:

sfind . > /dev/null
2:11.517r       3.590u  69.330s 55%     0M 0+0k 0st 0+0io 0pf+0w
find . > /dev/null
2:17.228r       4.680u  73.300s 56%     0M 0+0k 0st 0+0io 0pf+0w

This shows:

Sfind in this case needs 30% less USER CPU time than Sun find.
Sfind in this case needs 6% less SYSTEM CPU time than Sun find.

Jörg
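A sketch of how the four recommended runs might be driven so the timings are
comparable (mount points are assumptions; recreate each filesystem before its
runs, as done elsewhere in this thread):

    for fs in /mnt/ufs /benchpool; do
        for opt in "" "-no-fsync"; do
            ptime sh -c "cd $fs && star -x $opt f=/var/tmp/freedb-complete-20051104.tar.bz2 && sync && sync"
        done
    done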
Scott Howard <Scott.Howard at Sun.COM> wrote:

> > - Create two 10 GB files on a UFS partition
>
> Right, so how is this testing ZFS?

The way I did describe....  It shows e.g. that ZFS does a lot more I/O
than UFS.

> How about with a slightly more realistic situation - filesystems directly
> on disk!  (Novel concept, I know....)

????

> In all cases the hardware is a T3 disk array, with a 4-disk stripe in
> hardware (no mirroring/RAID-5).  For ZFS, this is then imported as a
> single filesystem in the pool.  For UFS, as a single filesystem, with
> logging enabled.
> In all cases the source file was stored in /tmp (tmpfs), and was dd'ed
> to /dev/null before starting so that any caching was the same between
> each run.  Both the ZFS and UFS filesystems were re-created between runs.
>
> With the T3's cache disabled:
>   ZFS - 17 minutes, 47 seconds real (5:35 user, 0:11 system)
>   UFS - 48 minutes, 28 seconds real (5:38 user, 0:13 system)
>
> With the T3's cache enabled:
>   ZFS - 15 minutes, 30 seconds real (5:49 user, 0:13 system)
>   UFS - 24 minutes, 29 seconds real (5:39 user, 0:13 system)

I am not sure whether this could be called realistic.  It uses extremely
expensive hardware.

> > - find is _extremely_ slow on ZFS
>
> "find", or "sfind" ?

Of course I use sfind, because Sun find is using an ancient method of
looping over getdents() ... stat() ... getdents() on the same directory,
while sfind uses the modern and POSIX 2001 compliant approach of first
reading in the whole list of names from a directory and then looping over
the names.

> Running "find . | wc -l" (Solaris find).  In both cases the filesystem
> cache was cleared before running (zfs export/import, ufs unmount/mount).
>   ZFS - 3 minutes, 39 seconds (0:07 user, 2:58 system)
>   UFS - 1 minute, 16 seconds (0:06 user, 0:50 system)

So the only thing you could prove with this test is that a find on ZFS
is still slower on extremely expensive HW than on emulated cheap HW
using UFS.

I am currently running a test on real hardware, but my impression is that
this will not give significantly different results than my first test.
I'll report later.

Jörg
Joerg Schilling wrote:
> Jason Ozolins <Jason.Ozolins at anu.edu.au> wrote:

> star-1.4.3 is _extremely_ old.  I would recommend to use star-1.5a70.

I built 1.5a70, and I don't think it made much difference to the timings.
It is what I have used for the tests below.

> In order to tell whether star was able to extract a file correctly, it
> calls fsync(f) and close(f) for every file and checks the return codes.
> This causes a 10-20% performance penalty on UFS (depending on file sizes).

And obviously much more on ZFS...

> > Firstly, star and tar perform very, very differently across ZFS and UFS
> > [all times in seconds, hope the formatting is preserved]:
> >
> >         UFS     ZFS
> > star     70    1086
> > tar     402      28
>
> This is _really_ interesting!
>
> But note first: you cannot compare these times without checking the state
> of the FS at the time when the untar operation did finish.  You should at
> least run two sync(1) calls after the tar extract and include the sync
> time in the time of the untar operation.  Otherwise you just compared
> times for two completely unknown and different tasks.
>
> If you like to compare ZFS vs. UFS, I recommend to run 4 tests:
>
> 	star -x on UFS, star -x -no-fsync on UFS,
> 	star -x on ZFS, star -x -no-fsync on ZFS

These new timings include both the extraction and a umount/zfs export to
totally ensure all data made it to the disk.  When the filesystems are
idle, the umount or export is basically instant, so I've assumed that any
time taken by the umount/export in this case is just to push data out to
disk.

For my ~111,000 file fragment of the freedb archive:

                UFS     ZFS
star             72    1080
tar             420      33
star -no-fsync   57      32

Times rounded to the nearest second because I make no representation of
their exact repeatability anyway.  The umount/export made some
difference, but not a whole heap.

Strangeness did happen once though: the tar extract to UFS ran > 3 times
faster...  From two runs:

timing tar extract to /ufs...
real 133.29
user 16.33
sys 12.69

timing tar extract to /ufs...
real 419.92
user 16.29
sys 12.76

Note that the user time and system time are essentially equal across the
two runs.  The filesystem is recreated before each extraction, so data
placement can't be part of the effect.  iostat on one of the slow runs
showed the tar extraction provoking extremely long ( > 30,000 ) pending
I/O queues.  Unfortunately I didn't have iostat going during the fast
run. :-(  Any ideas, Sun folks?

Anyway, looks like (for this extreme case of zillions of tiny files)
fsync is the cause of the huge blowout in extraction time.

Given that fsync is the issue, I just extracted the whole database to my
poky 6GB one-slice ZFS volume with star -no-fsync:

real 564.54
user 244.09
sys 181.85

And I just ran find over the whole thing after export/import of the pool:

# time -p find . | wc
 1872970 1872970 31616653
real 338.16
user 2.21
sys 53.77

Not as fast as Scott Howard's test, but then again it's a single consumer
disk.

Sfind is indeed faster than find on the whole freedb archive:

# (cd /zfs/freedb ; time -p sfind . | wc)
 1872970 1872970 31616653
real 246.00
user 1.34
sys 55.52

Certainly ZFS doesn't seem to be pathologically slow on this real world
disk.

I'm now extracting with star -no-fsync to my UFS volume, and by contrast
the thing is taking lots longer, and the filesystem's gone completely
unresponsive (huge I/O queue again; ls -al of the filesystem root is
stalled in an uninterruptible system call for many seconds on end when I
hit ^C).  And star is actually stuck in something quite noninterruptible
right now too, based on what happens when I try to truss it.  So I'm kind
of happy with ZFS right now. :-)

-Jason
(going home, not waiting for the UFS extract to complete)

ps: I still think that this is a totally crazy workload/data organisation
to give to any filesystem other than ReiserFS.  Do many users really
put > 100,000 files in a directory?
[ ... ]
> ps: I still think that this is a totally crazy workload/data organisation
> to give to any filesystem other than ReiserFS.  Do many users really
> put > 100,000 files in a directory?

You've never worked in Services for any of the major system vendors,
have you ?  ;-)

Complaints about "bad performance" when doing this are _really_ frequent.
We tend to get around ten requests a year about ufs' limit of 32765
subdirectories.  And more about (bad) performance with "manymany" files
in one directory.

I agree with you in the sense that I don't understand _why_ people want
to do this (and I wouldn't).  But I recognize that they are.  Whatever
reasons they have, if we fail to do this well then we lose standing.
Hence the test is a valid one.

Best regards,
FrankH.
Jason Ozolins <Jason.Ozolins at anu.edu.au> wrote:

> > star-1.4.3 is _extremely_ old.  I would recommend to use star-1.5a70.
>
> I built 1.5a70, and I don't think it made much difference to the timings.
> It is what I have used for the tests below.

Parts did become faster and other parts did become slower.  As recent
star versions include validity checking for the correctness of file meta
data, in theory it should be slower than implementations like Sun tar
that do not care.

> > In order to tell whether star was able to extract a file correctly, it
> > calls fsync(f) and close(f) for every file and checks the return codes.
> > This causes a 10-20% performance penalty on UFS (depending on file sizes).
>
> And obviously much more on ZFS...

This is something that should be worked on!
Using fsync() should not cause a performance penalty of 34x;
this points to deficits in the buffer/cache implementation.

> > If you like to compare ZFS vs. UFS, I recommend to run 4 tests:
> >
> > 	star -x on UFS, star -x -no-fsync on UFS,
> > 	star -x on ZFS, star -x -no-fsync on ZFS
>
> These new timings include both the extraction and a umount/zfs export to
> totally ensure all data made it to the disk.  When the filesystems are
> idle, the umount or export is basically instant, so I've assumed that any
> time taken by the umount/export in this case is just to push data out to
> disk.
>
> For my ~111,000 file fragment of the freedb archive:
>
>                 UFS     ZFS
> star             72    1080
> tar             420      33
> star -no-fsync   57      32
>
> Times rounded to the nearest second because I make no representation of
> their exact repeatability anyway.  The umount/export made some
> difference, but not a whole heap.

Interesting to see that star is still a bit faster than Sun tar, because
the speed optimizations in star have all been made for the create mode
and not for the extract mode.

> Strangeness did happen once though: the tar extract to UFS ran > 3 times
> faster...  From two runs:
>
> timing tar extract to /ufs...
> real 133.29
> user 16.33
> sys 12.69
>
> timing tar extract to /ufs...
> real 419.92
> user 16.29
> sys 12.76

This is really strange.
The user/system timings are identical.

> Anyway, looks like (for this extreme case of zillions of tiny files)
> fsync is the cause of the huge blowout in extraction time.

And this is why I have real respect for the UFS implementation....

> Given that fsync is the issue, I just extracted the whole database to my
> poky 6GB one-slice ZFS volume with star -no-fsync:
>
> real 564.54
> user 244.09
> sys 181.85

The whole 1.9 million files?  Could you name numbers for your machine?
RAM size, CPU type/speed, disk type/speed?

> And I just ran find over the whole thing after export/import of the pool:
> # time -p find . | wc
>  1872970 1872970 31616653
> real 338.16
> user 2.21
> sys 53.77
>
> Not as fast as Scott Howard's test, but then again it's a single consumer
> disk.
>
> Sfind is indeed faster than find on the whole freedb archive:
> # (cd /zfs/freedb ; time -p sfind . | wc)
>  1872970 1872970 31616653
> real 246.00
> user 1.34
> sys 55.52

If this is both ZFS, I would like to know how much RAM you were using.
It seems that ZFS does not behave nicely on internal ZFS meta data cache
misses.

> Certainly ZFS doesn't seem to be pathologically slow on this real world
> disk.

It is for me...  Tested on a machine with 1280 megabytes of RAM.

> I'm now extracting with star -no-fsync to my UFS volume, and by contrast
> the thing is taking lots longer, and the filesystem's gone completely
> unresponsive (huge I/O queue again; ls -al of the filesystem root is
> stalled in an uninterruptible system call for many seconds on end when I
> hit ^C).

This is something I also noted yesterday.  It seems that otherwise the
buffer cache gets full and makes the system sticky.

> And star is actually stuck in something quite noninterruptible right now
> too, based on what happens when I try to truss it.  So I'm kind of happy
> with ZFS right now. :-)

Did it stick forever?

Jörg
Joerg Schilling wrote:
> Jason Ozolins <Jason.Ozolins at anu.edu.au> wrote:

> > > In order to tell whether star was able to extract a file correctly, it
> > > calls fsync(f) and close(f) for every file and checks the return codes.
> > > This causes a 10-20% performance penalty on UFS (depending on file sizes).
> >
> > And obviously much more on ZFS...
>
> This is something that should be worked on!
> Using fsync() should not cause a performance penalty of 34x;
> this points to deficits in the buffer/cache implementation.

The 34x penalty is in an extreme case of extracting lots of very small
files.  For larger files, the latency enforced by the fsync would not be
so apparent.

I'm trying to think of the use cases where fsync is valuable for tar
extraction:

1. the system dies during the extraction, and you were relying on verbose
   output to tell you exactly what's been extracted, and that output is
   preserved for you to refer to
2. the system dies shortly after the extraction completes and you have
   taken the completion of the command as a signal that all I/O associated
   with the command is complete
3. you want to detect any I/O errors resulting from the extraction
4. ?

Case 1 seems pretty marginal to me.

I'd really like a mechanism to enforce case 2 at the user level though,
like "please synchronously checkpoint all my outstanding I/O".  Sync does
too much (an ordinary user shouldn't be able to mess with I/O policy for
the whole machine) and too little (you can't tell whether the data's
really made it to stable storage when the sync system call returns).
This would be good for scripting where you don't have control over the
behaviour of individual programs, like Solaris tar for instance.

Case 3 is problematic anyway, because in a logging filesystem the data
might be written to a temporary log and not to its eventual destination
at the time the command completes.  The fact that it has reached stable
storage is not enough to guarantee that nothing else can go wrong... :-)

> > Strangeness did happen once though: the tar extract to UFS ran > 3 times
> > faster...  From two runs:
> >
> > timing tar extract to /ufs...
> > real 133.29
> > user 16.33
> > sys 12.69
> >
> > timing tar extract to /ufs...
> > real 419.92
> > user 16.29
> > sys 12.76
>
> This is really strange.
> The user/system timings are identical.

Indeed.  My vague guess is that there's a kernel thread that periodically
flushes dirty pages (buffers? showing my ignorance here) to disk, and
that the start time of the extraction has to be in a certain phase range
of that kernel thread's cycle for the extraction to happen quickly.  The
/ufs file system was mounted with logging enabled, BTW.

> > Given that fsync is the issue, I just extracted the whole database to my
> > poky 6GB one-slice ZFS volume with star -no-fsync:
> >
> > real 564.54
> > user 244.09
> > sys 181.85
>
> The whole 1.9 million files?  Could you name numbers for your machine?
> RAM size, CPU type/speed, disk type/speed?

Yes, the whole lot, as reported by the find/sfind runs below.  It's a
pretty boring machine by modern desktop standards, except for the number
of disks:

Athlon 64 3200 (socket 754, 2GHz, 1MB L2 cache)
Asus K8N-E motherboard, nForce3-250 chipset, 1GB of RAM
4 * Seagate 160GB 7200RPM SATA disks
  - 2 on nForce SATA controller
  - 2 on Silicon Image 3114 SATA controller
  - disks are NCQ capable, but the controllers aren't

Both the /ufs and /zfs test filesystems were single slices from the start
of the disks attached to the Silicon Image controller.

bzcat may like having 1MB of L2 cache:

-bash-3.00# time -p bzcat ~jao900/freedb-complete-20051104.tar.bz2 > /dev/null
real 226.55
user 219.49
sys 0.94

So the tar extraction part actually takes 338 seconds real time.  (My
star extraction timing listed above is for the whole pipeline.)

> > And I just ran find over the whole thing after export/import of the pool:
> > # time -p find . | wc
> >  1872970 1872970 31616653
> > real 338.16
> > user 2.21
> > sys 53.77
> >
> > Not as fast as Scott Howard's test, but then again it's a single consumer
> > disk.
> >
> > Sfind is indeed faster than find on the whole freedb archive:
> > # (cd /zfs/freedb ; time -p sfind . | wc)
> >  1872970 1872970 31616653
> > real 246.00
> > user 1.34
> > sys 55.52
>
> If this is both ZFS, I would like to know how much RAM you were using.
> It seems that ZFS does not behave nicely on internal ZFS meta data cache
> misses.

1GB of RAM, no tweaks to any memory management policy tunables that might
exist.  The filesystems were exported/imported before each run of
find/sfind, so the cache was clear of any ZFS metadata.

> > Certainly ZFS doesn't seem to be pathologically slow on this real world
> > disk.
>
> It is for me...  Tested on a machine with 1280 megabytes of RAM.

Now that's odd.  You've got more RAM than me...  I wasn't running
anything except a couple of SSH sessions and the dtlogin window on the
test machine at the time, so it should have had maybe 850-900MB of
available memory.

> > I'm now extracting with star -no-fsync to my UFS volume, and by contrast
> > the thing is taking lots longer, and the filesystem's gone completely
> > unresponsive (huge I/O queue again; ls -al of the filesystem root is
> > stalled in an uninterruptible system call for many seconds on end when I
> > hit ^C).
>
> This is something I also noted yesterday.  It seems that otherwise the
> buffer cache gets full and makes the system sticky.
>
> > And star is actually stuck in something quite noninterruptible right now
> > too, based on what happens when I try to truss it.  So I'm kind of happy
> > with ZFS right now. :-)
>
> Did it stick forever?

No.  I came back this morning to find that my 6GB partition didn't have
enough inodes (d'oh!).  I'll try again tonight when I don't need the
machine to be responsive.

-Jason
If I remember correctly, sync(1) and fsync(3c) actually work as
advertised on ZFS, and guarantee that data has been written to disk
before the syscall returns.  Thus, there will be more overhead than on
UFS (where the pages are flushed, but may or may not have reached stable
storage).

I guess the question (as Jason points out) is what are you trying to
accomplish?  If you really want the data for each file on disk between
each extraction, then you are getting what you asked for on ZFS, unlike
UFS.  I don't see any value to the (broken) UFS semantics.

That being said, there is always performance to be gained.  But I
wouldn't expect a 34x improvement any time soon.

- Eric

On Mon, Nov 28, 2005 at 12:09:16PM +1100, Jason Ozolins wrote:
> Joerg Schilling wrote:
>
> > This is something that should be worked on!
> > Using fsync() should not cause a performance penalty of 34x;
> > this points to deficits in the buffer/cache implementation.
>
> The 34x penalty is in an extreme case of extracting lots of very small
> files.  For larger files, the latency enforced by the fsync would not be
> so apparent.
>
> I'm trying to think of the use cases where fsync is valuable for tar
> extraction:
> 1. the system dies during the extraction, and you were relying on verbose
>    output to tell you exactly what's been extracted, and that output is
>    preserved for you to refer to
> 2. the system dies shortly after the extraction completes and you have
>    taken the completion of the command as a signal that all I/O associated
>    with the command is complete
> 3. you want to detect any I/O errors resulting from the extraction
> 4. ?
>
> Case 1 seems pretty marginal to me.
>
> I'd really like a mechanism to enforce case 2 at the user level though,
> like "please synchronously checkpoint all my outstanding I/O".  Sync does
> too much (an ordinary user shouldn't be able to mess with I/O policy for
> the whole machine) and too little (you can't tell whether the data's
> really made it to stable storage when the sync system call returns).
> This would be good for scripting where you don't have control over the
> behaviour of individual programs, like Solaris tar for instance.
>
> Case 3 is problematic anyway, because in a logging filesystem the data
> might be written to a temporary log and not to its eventual destination
> at the time the command completes.  The fact that it has reached stable
> storage is not enough to guarantee that nothing else can go wrong... :-)

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
Eric is right that sync() for ZFS guarantees all outstanding
data and meta data has been written to stable storage.  UFS
has indeed only ever scheduled the IO but not waited for its
completion.  However, fsync() has always been synchronous for
UFS (and ZFS).  So I'm concerned about the severe performance
degradation noticed.  I haven't seen that in my own benchmarking
of fsync().  I'll try to reproduce this.

Neil.

Eric Schrock wrote on 11/27/05 18:48:
> If I remember correctly, sync(1) and fsync(3c) actually work as
> advertised on ZFS, and guarantee that data has been written to disk
> before the syscall returns.  Thus, there will be more overhead than on
> UFS (where the pages are flushed, but may or may not have reached stable
> storage).
>
> I guess the question (as Jason points out) is what are you trying to
> accomplish?  If you really want the data for each file on disk between
> each extraction, then you are getting what you asked for on ZFS, unlike
> UFS.  I don't see any value to the (broken) UFS semantics.
>
> That being said, there is always performance to be gained.  But I
> wouldn't expect a 34x improvement any time soon.
>
> - Eric
>
> On Mon, Nov 28, 2005 at 12:09:16PM +1100, Jason Ozolins wrote:
>
> > Joerg Schilling wrote:
> >
> > > This is something that should be worked on!
> > > Using fsync() should not cause a performance penalty of 34x;
> > > this points to deficits in the buffer/cache implementation.
> >
> > The 34x penalty is in an extreme case of extracting lots of very small
> > files.  For larger files, the latency enforced by the fsync would not be
> > so apparent.
> >
> > I'm trying to think of the use cases where fsync is valuable for tar
> > extraction:
> > 1. the system dies during the extraction, and you were relying on verbose
> >    output to tell you exactly what's been extracted, and that output is
> >    preserved for you to refer to
> > 2. the system dies shortly after the extraction completes and you have
> >    taken the completion of the command as a signal that all I/O associated
> >    with the command is complete
> > 3. you want to detect any I/O errors resulting from the extraction
> > 4. ?
> >
> > Case 1 seems pretty marginal to me.
> >
> > I'd really like a mechanism to enforce case 2 at the user level though,
> > like "please synchronously checkpoint all my outstanding I/O".  Sync does
> > too much (an ordinary user shouldn't be able to mess with I/O policy for
> > the whole machine) and too little (you can't tell whether the data's
> > really made it to stable storage when the sync system call returns).
> > This would be good for scripting where you don't have control over the
> > behaviour of individual programs, like Solaris tar for instance.
> >
> > Case 3 is problematic anyway, because in a logging filesystem the data
> > might be written to a temporary log and not to its eventual destination
> > at the time the command completes.  The fact that it has reached stable
> > storage is not enough to guarantee that nothing else can go wrong... :-)

--
Neil
On Sun, Nov 27, 2005 at 09:28:30PM -0700, Neil Perrin wrote:
> Eric is right that sync() for ZFS guarantees all outstanding
> data and meta data has been written to stable storage.  UFS
> has indeed only ever scheduled the IO but not waited for its
> completion.  However, fsync() has always been synchronous for
> UFS (and ZFS).  So I'm concerned about the severe performance
> degradation noticed.  I haven't seen that in my own benchmarking
> of fsync().  I'll try to reproduce this.
>
> Neil.

Obviously I didn't remember quite right ;-)  What's an extra 'f', anyway?
Thanks for the clarification.

- Eric
> If I remember correctly, sync(1) and fsync(3c) actually work as
> advertised on ZFS, and guarantee that data has been written to disk
> before the syscall returns.  Thus, there will be more overhead than on
> UFS (where the pages are flushed, but may or may not have reached stable
> storage).
>
> I guess the question (as Jason points out) is what are you trying to
> accomplish?  If you really want the data for each file on disk between
> each extraction, then you are getting what you asked for on ZFS, unlike
> UFS.  I don't see any value to the (broken) UFS semantics.
>
> That being said, there is always performance to be gained.  But I
> wouldn't expect a 34x improvement any time soon.

fsync() does guarantee that it won't return until the
data is on stable storage, unlike sync().

However, UFS fsync is less than useful because it doesn't guarantee
that the inode and the directory entry and the directory inode have
reached stable storage.

Not sure what zfs guarantees?  Does it guarantee that the ueberblock
which points to the newly created file and its data is on disk?

Casper
Casper.Dik at sun.com wrote on 11/28/05 02:19:
>
> fsync() does guarantee that it won't return until the
> data is on stable storage, unlike sync().
>
> However, UFS fsync is less than useful because it doesn't guarantee
> that the inode and the directory entry and the directory inode have
> reached stable storage.

With UFS logging (now the default) the directory entry and inode are on
stable storage (in the log) by return from the fsync.  This may not have
been so without logging.

> Not sure what zfs guarantees?  Does it guarantee that the ueberblock
> which points to the newly created file and its data is on disk?

ZFS will push all data and meta data to the intent log by return from
fsync().  The uber block will not be updated until the DMU transaction
group commits - potentially seconds later.  However, on power loss or
panic the intent log is replayed, which will recreate the transactions
and flush them by forcing the DMU transaction group to commit.  So the
uber block will contain the tree with the newly created file.

Neil
Jason Ozolins <Jason.Ozolins at anu.edu.au> wrote:> > This is something that shouild been worked on! > > Using fsync() should not cause a performance penalty of 34x, > > this leads to deficits in the buffer/cache implementation. > > The 34x penalty is in an extreme case of extracting lots of very small files. > For larger files, the latency enforced by the fsync would not be so apparent.I get the following numbers: UFS ZFS star -x 1:55:20 14:55:20 star -x -no-fsync 1:57:16 3:57:08 So comparing the fsync() case for UFS vs. ZFS, I get a factor of ~ 7.8x This is something that could be worked on. rm -rf is taking 27 minutes on UFS and 5 hours on ZFS. This is something that needs to be worked on.> I''m trying to think of the use cases where fsync is valuable for tar extraction: > 1. the system dies during the extraction, and you were relying on verbose > output to tell you exactly what''s been extracted, and that output is > preserved for you to refer to > 2. the system dies shortly after the extraction completes and you have taken > the completion of the command as a signal that all I/O associated with the > command is complete > 3. you want to detect any i/o errors resulting from the extraction > 4. ?I like usrs of star to be able to evaluate the exit code and take exit code 0 as a signal that extraction was definitely OK.> I''d really like a mechanism to enforce case 2 at the user level though, like > "please synchronously checkpoint all my outstanding I/O". Sync does too much > (an ordinary user shouldn''t be able to mess with I/O policy for the whole > machine) and too little (you can''t tell whether the data''s really made it to > stable storage when the sync system call returns). This would be good for > scripting where you don''t have control over the behaviour of individual > programs, like Solaris tar for instance.The problem is that you cannot ask the system to report whether the whole star -x run was successful at the end by calling something magical. The only way is the ask this for every file and this could only be done via fsync() calls at the end of the extraction of every file.> Case 3 is problematic anyway, because in a logging filesystem the data might > be written to a temporary log and not to its eventual destination at the time > the command completes. The fact that it has reached stable storage is not > enough to guarantee that nothing else can go wrong... :-)In this case, the logging system would need to signal an exception and keep the log data.> > The whole 1.9 million files? Could you name numbers for yout machine? > > RAM size, CPU type/speed, Disk type/speed? > > Yes, the whole lot, as reported by the find/sfind runs below. It''s a pretty > boring machine by modern desktop standards, except for the number of disks: > Athlon 64 3200 (socket 754, 2GHz, 1MB L2 cache) > Asus K8N-E motherboard, nForce3-250 chipset, 1GB of RAM > 4 * Seagate 160GB 7200RPM SATA disks > - 2 on nForce SATA controller > - 2 on Silicon Image 3114 SATA controller > - disks are NCQ capable, but the controllers aren''tSo it is not a really fast machine with plenty of RAM.> -bash-3.00# time -p bzcat ~jao900/freedb-complete-20051104.tar.bz2 > /dev/null > real 226.55 > user 219.49 > sys 0.94 > > So the tar extraction part actually takes 338 seconds real time. (my star > extraction timing listed above is for the whole pipeline).Did you susbtract numbers? Star always runs in two processes (a background FIFO process and a foreground extract process). 
In the case of decompression, there is another process (the decompressor)
that may run simultaneously, e.g. during the I/O wait times.

> >> And I just ran find over the whole thing after export/import of the pool:
> >> # time -p find . | wc
> >>  1872970 1872970 31616653
> >> real 338.16
> >> user 2.21
> >> sys 53.77
> >>
> >> Not as fast as Scott Howard's test, but then again it's a single
> >> consumer disk.
> >>
> >> Sfind is indeed faster than find on the whole freedb archive:
> >> # (cd /zfs/freedb ; time -p sfind . | wc)
> >>  1872970 1872970 31616653
> >> real 246.00
> >> user 1.34
> >> sys 55.52
> >
> > If this is both ZFS, I would like to know how much RAM you were using.
> > It seems that ZFS does not behave nicely on internal ZFS metadata
> > cache misses.
>
> 1GB of RAM, no tweaks to any memory management policy tunables that
> might exist. The filesystems were exported/imported before each run of
> find/sfind, so the cache was clear of any ZFS metadata.

It is unclear to me why I see the opposite behavior.

Jörg
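(For what it's worth, below is the sort of script I would use to keep such
runs comparable. The pool name, mountpoint and archive path are
placeholders, the star options simply mirror the ones quoted earlier in
the thread, and ptime is only there for per-command timing - a sketch,
not a recipe.)

    #!/bin/sh
    # Sketch of a repeatable comparison run on a ZFS pool.
    POOL=testpool                    # placeholder names
    FS=/testpool/freedb
    ARCHIVE=/var/tmp/freedb-complete-20051104.tar.bz2

    coldstart() {
            # Export/import so each run starts with a cold metadata
            # cache, as in the find/sfind runs above.  The cwd must be
            # outside the pool while it is exported.
            cd /
            zpool export $POOL || exit 1
            zpool import $POOL || exit 1
            cd $FS || exit 1
    }

    coldstart
    ptime sh -c "bzcat $ARCHIVE | star -x"             # per-file fsync()
    echo "extraction pipeline exit code: $?"

    # For a strict comparison, destroy and recreate the filesystem here
    # instead of extracting over the existing tree.
    coldstart
    ptime sh -c "bzcat $ARCHIVE | star -x -no-fsync"   # no per-file fsync()

    coldstart
    ptime sh -c "find . | wc -l"                       # traversal cost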
Eric Schrock <eric.schrock at sun.com> wrote:
> If I remember correctly, sync(1) and fsync(3c) actually work as
> advertised on ZFS, and guarantee that data has been written to disk
> before the syscall returns. Thus, there will be more overhead than on
> UFS (where the pages are flushed, but may or may not have reached stable
> storage).

Could you explain what you understand by stable storage?

> I guess the question (as Jason points out) is what are you trying to
> accomplish? If you really want the data for each file on disk between
> each extraction, then you are getting what you asked for on ZFS, unlike
> UFS. I don't see any value to the (broken) UFS semantics.

What does UFS?

Jörg
On Mon, Nov 28, 2005 at 05:38:41PM +0100, Joerg Schilling wrote:
> Eric Schrock <eric.schrock at sun.com> wrote:
>
> > If I remember correctly, sync(1) and fsync(3c) actually work as
> > advertised on ZFS, and guarantee that data has been written to disk
> > before the syscall returns. Thus, there will be more overhead than on
> > UFS (where the pages are flushed, but may or may not have reached
> > stable storage).
>
> Could you explain what you understand by stable storage?

The backing store for the filesystem, typically a disk. Note that Neil
and Casper clarified some of my (invalid) assumptions. For fsync(), UFS
does guarantee that pending writes will reach stable storage, but not
for sync(). ZFS makes this guarantee for both.

> > I guess the question (as Jason points out) is what are you trying to
> > accomplish? If you really want the data for each file on disk between
> > each extraction, then you are getting what you asked for on ZFS, unlike
> > UFS. I don't see any value to the (broken) UFS semantics.
>
> What does UFS?

I don't know what this means, but it's probably based on my invalid
assumption as described above. Neil is looking into the fsync()
performance issues.

- Eric
Casper.Dik at Sun.COM wrote:
> fsync() does guarantee that it won't return until the
> data is on stable storage, unlike sync().
>
> However, UFS fsync is less than useful because it doesn't guarantee
> that the inode and the directory entry and the directory inode have
> reached stable storage.

With logging, this seems to be sufficient - or did I miss something?

Jörg
> > It is unclear to me why I see the opposite behavior.

Just to be clear, you _are_ running this on non-DEBUG SX:CR 27a bits,
correct? You'll get rather different results when running on
OpenSolaris/BFU DEBUG bits thanks to our liberal use of kmem caches and
some ZFS debugging features which are turned on for DEBUG builds[1].

- Eric

[1] You can minimize these effects by setting 'kmem_flags' and
'zfs_flags' to zero, but you will still have various overhead.
Performance testing with DEBUG bits is never really valid.
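(For the record, one way to set those two. It assumes the variables
behave as the footnote describes: zfs_flags can be poked on a live DEBUG
kernel with mdb, while kmem_flags is only consulted at boot and so has to
go through /etc/system.)

    # Disable the ZFS debug features on a running DEBUG kernel:
    echo "zfs_flags/W 0" | mdb -kw

    # kmem debugging is decided at boot time, so turn it off in
    # /etc/system and reboot:
    echo "set kmem_flags=0x0" >> /etc/system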
> ZFS will push all data and metadata to the intent log by return
> from fsync(). The uber block will not be updated until the DMU
> transaction group commits - potentially seconds later.
> However, on power loss or panic, the intent log is replayed, which
> will recreate the transactions and flush them by forcing the
> DMU transaction group to commit. So the uber block will contain
> the tree with the newly created file.

So what's the preferred spelling for "uberblock"? Is it "über", "ueber"
(both correct German forms) or "uber" (a hitherto unknown word)? The
pronunciation is rather different.

Casper
On Mon, Nov 28, 2005 at 07:17:21PM +0100, Casper.Dik at sun.com wrote:
>
> So what's the preferred spelling for "uberblock"? Is it "über", "ueber"
> (both correct German forms) or "uber" (a hitherto unknown word)? The
> pronunciation is rather different.

The spelling without the umlaut is acceptable; the 'ue' isn't. While a
bastardized English form, it's hardly "hitherto unknown", see:

http://en.wikipedia.org/wiki/%C3%9Cber

In particular, this phrase:

    Über is commonly misspelled as uber in English, although the
    correct substitute for the 'ü'-Umlaut would be ue, not just 'u'

In particular, the non-umlaut spelling is common in many informal/slang
settings, such as "ubercool". See:

http://www.urbandictionary.com/define.php?term=uber

In ZFS, the non-umlaut version is the common form, since source code
doesn't easily allow us the use of the real thing ;-)

- Eric
> The spelling without the umlaut is acceptable; the 'ue' isn't. While a
> bastardized English form, it's hardly "hitherto unknown", see:

Why isn't "ue" acceptable? "ue" is the correct form, as the Wikipedia
entry says....

"Uber" is pronounced "oober" and writing it makes you look silly and
uneducated.

Casper
On Mon, 2005-11-28 at 14:05, Casper.Dik at Sun.COM wrote:
> "Uber" is pronounced "oober" and writing it makes you look silly
> and uneducated.

You expect otherwise from an English slang form, with a life of its own
distinct from the original German word?

To quote someone or other: "We don't just borrow words; on occasion,
English has pursued other languages down alleyways to beat them
unconscious and rifle their pockets for new vocabulary."

- Bill
An ultimate irony is that English conjugations and declensions resemble
the ones German uses in the few cases where it imports words from foreign
languages, except for some 1000 core English words (which, I believe,
follow German's native rules). Thus English is like German would be if it
rifled other languages for its words. (This according to a linguist who
gave a guest lecture to a biochem faculty at a prominent research
university.)

Bill Ross

Bill Sommerfeld wrote:
> On Mon, 2005-11-28 at 14:05, Casper.Dik at Sun.COM wrote:
>
> > "Uber" is pronounced "oober" and writing it makes you look silly
> > and uneducated.
>
> You expect otherwise from an English slang form, with a life of its own
> distinct from the original German word?
>
> To quote someone or other: "We don't just borrow words; on occasion,
> English has pursued other languages down alleyways to beat them
> unconscious and rifle their pockets for new vocabulary."
>
> - Bill
Eric Schrock wrote:
> The spelling without the umlaut is acceptable; the 'ue' isn't. While a
> bastardized English form, it's hardly "hitherto unknown", see:
>
> http://en.wikipedia.org/wiki/%C3%9Cber
>
> In particular, this phrase:
>
>     Über is commonly misspelled as uber in English, although the
>     correct substitute for the 'ü'-Umlaut would be ue, not just 'u'
>
> In particular, the non-umlaut spelling is common in many informal/slang
> settings, such as "ubercool". See:
>
> http://www.urbandictionary.com/define.php?term=uber
>
> In ZFS, the non-umlaut version is the common form, since source code
> doesn't easily allow us the use of the real thing ;-)

I still think "Dr. Feelgood" is their best album.

Wait....am I on the wrong list again?

--
Torrey McMahon
Sun Microsystems Inc.
Eric Schrock <eric.schrock at sun.com> wrote:
> Just to be clear, you _are_ running this on non-DEBUG SX:CR 27a bits,
> correct? You'll get rather different results when running on

As long as there is no official way to run non-debug kernels....

> OpenSolaris/BFU DEBUG bits thanks to our liberal use of kmem caches and
> some ZFS debugging features which are turned on for DEBUG builds[1].
>
> [1] You can minimize these effects by setting 'kmem_flags' and
> 'zfs_flags' to zero, but you will still have various overhead.
> Performance testing with DEBUG bits is never really valid.

Thank you for the hint. I did set zfs_flags to 0 and this caused a slight
speedup.

The interesting result is that the extraction was _really_ fast until the
first big directory ('misc') reached 440,000 entries. Then it became as
slow as before. As a result, the star -xp -no-fsync extract did speed up
from 3:57 h to 3:00 h.

If there is a way to prevent the extreme slowdown after 440,000 entries,
ZFS would be noticeably faster than UFS!

Jörg
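(If someone wants to chase that directory-size knee without re-extracting
freedb each time, a crude reproducer along these lines might do. The path
and batch size are made up; all it does is create a million empty files
in one directory and report per-batch timing.)

    #!/bin/sh
    # Create 1,000,000 empty files in a single directory, 50,000 at a
    # time, and time each batch to see whether create cost jumps once
    # the directory holds a few hundred thousand entries.
    DIR=/zfs/bigdir                  # placeholder path on the ZFS pool
    mkdir -p $DIR || exit 1

    b=0
    while [ $b -lt 20 ]; do
            count=`expr $b \* 50000`
            echo "batch $b (roughly $count entries so far)"
            nawk -v dir=$DIR -v b=$b 'BEGIN {
                    for (i = 0; i < 50000; i++)
                            printf("%s/f.%d.%d\n", dir, b, i)
            }' < /dev/null | ptime xargs touch
            b=`expr $b + 1`
    done

On the run described above, the interesting batches would be the ones
around 400,000-500,000 entries, where the 'misc' directory apparently hit
the wall.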
Eric Schrock <eric.schrock at sun.com> wrote:
> In particular, the non-umlaut spelling is common in many informal/slang
> settings, such as "ubercool". See:
>
> http://www.urbandictionary.com/define.php?term=uber

Something I would write and pronounce "obercool".

Jörg
On Tue, Nov 29, 2005 at 03:02:50PM +0100, Joerg Schilling wrote:
>
> As long as there is no official way to run non-debug kernels....

There is. It's called Solaris Express. If you're going to do any future
performance testing for ZFS, please use these non-DEBUG bits or we'll
have no way to objectively analyze your results (unless, of course,
you're trying to measure the DEBUG overhead). Or just wait for non-DEBUG
opensolaris bits (should be coming shortly).

> If there is a way to prevent the extreme slowdown after 440,000
> entries, ZFS would be noticeably faster than UFS!

Noel is looking at some ZAP oddities with large directories (there are
some large performance jumps at various sizes). But this could also be an
artifact of running DEBUG bits - nothing's for certain when you're doing
performance testing on DEBUG bits.

- Eric
Roch Bourbonnais - Performance Engineering wrote on 2005-Nov-29 16:35 UTC
([zfs-discuss] ZFS extremely slow):
One hypothesis for the performance delta would be that ZFS actually
reclaims buffers too aggressively (and thus after a while would need to
go to disk). I am not sure how to reduce this aggressiveness; can anyone
provide guidance here? Is adding swap space sufficient?

-r
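(Not an answer, but one way to see whether general memory pressure is
what is evicting those buffers would be to watch where physical memory
goes while the test runs. This assumes an mdb recent enough to have the
::memstat dcmd.)

    # Rough breakdown of physical memory (kernel, anon, page cache,
    # free); run it a few times during the extraction.
    echo "::memstat" | mdb -k

    # Free memory and page scanner activity over time; a sustained
    # non-zero "sr" column means reclaim is being forced.
    vmstat 5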
Joerg Schilling wrote:
> Eric Schrock <eric.schrock at sun.com> wrote:
>
> > In particular, the non-umlaut spelling is common in many informal/slang
> > settings, such as "ubercool". See:
> >
> > http://www.urbandictionary.com/define.php?term=uber
>
> Something I would write and pronounce "obercool"
>
> Jörg

Just one step from almost the original meaning: overcool

Bill