Brandorr
2007-Aug-21 04:01 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
Is ZFS efficient at handling huge populations of tiny-to-small files - for example, 20 million TIFF images in a collection, each between 5 and 500k in size?

I am asking because I could have sworn that I read somewhere that it isn't, but I can't find the reference.

Thanks,
Brian

--
- Brian Gupta
http://opensolaris.org/os/project/nycosug/
Matthew Ahrens
2007-Aug-21 04:24 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
Brandorr wrote:
> Is ZFS efficient at handling huge populations of tiny-to-small files -
> for example, 20 million TIFF images in a collection, each between 5
> and 500k in size?

Do you mean efficient in terms of space used? If so, then in general it is quite efficient. E.g., for files < 128k, space is rounded up only to a multiple of 512 bytes. Around 1k of metadata is consumed per file.

There are, however, a few cases where it will not be optimal. E.g., a 129k file will use up 256k of space. However, you can work around this problem by turning on compression.

> I am asking because I could have sworn that I read somewhere that it
> isn't, but I can't find the reference.

If you find it, let us know.

--matt
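A minimal sketch of the rounding behaviour Matt describes above (default 128k recordsize, 512-byte sector rounding, roughly 1k of metadata per file). The function name and constants are illustrative assumptions, not ZFS code, and compression is ignored:

def estimated_allocation(file_size, recordsize=128 * 1024, sector=512,
                         metadata=1024):
    """Rough estimate of on-disk space for one file, no compression."""
    if file_size <= recordsize:
        # Files up to one record are rounded up to a multiple of 512 bytes.
        data = -(-file_size // sector) * sector
    else:
        # Larger files consume whole recordsize blocks, so a 129k file
        # occupies 256k (the case compression can win back).
        data = -(-file_size // recordsize) * recordsize
    return data + metadata

for size in (5 * 1024, 100 * 1024, 129 * 1024, 500 * 1024):
    print(size, "->", estimated_allocation(size))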
Ralf Ramge
2007-Aug-21 11:37 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
Brandorr wrote:
> Is ZFS efficient at handling huge populations of tiny-to-small files -
> for example, 20 million TIFF images in a collection, each between 5
> and 500k in size?
>
> I am asking because I could have sworn that I read somewhere that it
> isn't, but I can't find the reference.

If you're worried about I/O throughput, you should avoid RAIDZ1/2 configurations. Random read performance will be disastrous if you do; I've seen random read rates of less than 1 MB/s on an X4500 with 40 dedicated disks for data storage.

If you don't have to worry about disk space, use mirrors; I got my best results during my extensive X4500 benchmarking sessions when I mirrored single slices instead of complete disks (resulting in 40 2-way mirrors on 40 physical disks, mirroring c0t0d0s0->c0t1d0s1 and c0t1d0s0->c0t0d0s1, and so on). If you're worried about disk space, you should consider striping several instances of RAIDZ1 arrays, each one consisting of three disks or slices. Sequential access will go off a cliff, but random reads will be boosted.

You should also adjust the recordsize. Try to measure the average I/O transaction size. There's a good chance that your I/O performance will be best if you set your recordsize to a smaller value. For instance, if your average file size is 12 KB, try using an 8K or even 4K recordsize; stay away from 16K or higher.

--
Ralf Ramge
Senior Solaris Administrator, SCNA, SCSA

Tel. +49-721-91374-3963
ralf.ramge at webde.de - http://web.de/

1&1 Internet AG
Brauerstraße 48
76135 Karlsruhe

Amtsgericht Montabaur HRB 6484
Vorstand: Henning Ahlert, Ralph Dommermuth, Matthias Ehrlich, Andreas Gauger, Matthias Greve, Robert Hoffmann, Norbert Lang, Achim Weiss
Aufsichtsratsvorsitzender: Michael Scheeren
Mario Goebbels
2007-Aug-21 12:39 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
> There are, however, a few cases where it will not be optimal. E.g., a 129k
> file will use up 256k of space. However, you can work around this problem
> by turning on compression.

Doesn't ZFS pack the last block into one of a multiple of 512?

If not, it's a surprise that there isn't a pseudo-compression mode available to deal with that.

-mg
Łukasz K
2007-Aug-21 13:26 UTC
[zfs-discuss] Odp: Is ZFS efficient for large collections of small files?
> Is ZFS efficient at handling huge populations of tiny-to-small files -
> for example, 20 million TIFF images in a collection, each between 5
> and 500k in size?
>
> I am asking because I could have sworn that I read somewhere that it
> isn't, but I can't find the reference.

It depends on what type of I/O you will do. If you only read, there is no problem. Writing (and removing) small files will fragment the pool, and that will be a huge problem. You can set the recordsize to 32k (or 16k) and it will help for some time.

Lukas
Eric Schrock
2007-Aug-21 15:40 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
On Tue, Aug 21, 2007 at 02:39:00PM +0200, Mario Goebbels wrote:
> > There are, however, a few cases where it will not be optimal. E.g., a 129k
> > file will use up 256k of space. However, you can work around this problem
> > by turning on compression.
>
> Doesn't ZFS pack the last block into one of a multiple of 512?
>
> If not, it's a surprise that there isn't a pseudo-compression mode
> available to deal with that.
>
> -mg

This would certainly be nice. See:

6279263 We should have a "zeroes" compression algorithm

- Eric

--
Eric Schrock, Solaris Kernel Development
http://blogs.sun.com/eschrock
Brandorr
2007-Aug-21 15:52 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
On 8/21/07, Matthew Ahrens <Matthew.Ahrens at sun.com> wrote:
> Brandorr wrote:
> > Is ZFS efficient at handling huge populations of tiny-to-small files -
> > for example, 20 million TIFF images in a collection, each between 5
> > and 500k in size?
>
> Do you mean efficient in terms of space used? If so, then in general it is
> quite efficient. E.g., for files < 128k, space is rounded up only to a
> multiple of 512 bytes. Around 1k of metadata is consumed per file.
>
> There are, however, a few cases where it will not be optimal. E.g., a 129k
> file will use up 256k of space. However, you can work around this problem
> by turning on compression.

You answer part of my question perfectly (regarding space utilization). The other part was related to performance, but someone else has answered that.

> > I am asking because I could have sworn that I read somewhere that it
> > isn't, but I can't find the reference.
>
> If you find it, let us know.

It turns out that what I read was related to the fact that RAID-Z is a suboptimal volume layout for the "large amounts of small files" use case. (If you still want, I can probably find it, as I think it was an opensolaris.org discussion thread.) (Richard reminded me of this.)

One issue: the person looking at doing this has 8 x 750GB drives, so some sort of parity-based RAID striping will be required. Since this will be suboptimal for any file system, I think the performance impact can be dealt with.

I'd like to thank everyone for their responses. Based on the discussion we've had and my reviews of the XFS and ReiserFS file systems, I feel very confident recommending ZFS as a superior alternative. (ZFS data integrity, scalability and compression are the key winners here.) (Now I just need to research compatibility between Linux and Solaris NFS, and whether his 3Ware card works with OpenSolaris.)

Thanks,
Brian

P.S. - Is there a ZFS FAQ somewhere?

--
- Brian Gupta
http://opensolaris.org/os/project/nycosug/
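For reference, the usable-capacity side of that trade-off for 8 x 750GB drives is simple arithmetic, using the usual rules of thumb (mirror pairs keep half, raidz1 loses one disk per group). The layout names below are just illustrations, and the figures ignore metadata and formatting overhead:

# Usable-capacity rules of thumb for the 8 x 750 GB drives mentioned above.
DISKS, SIZE_GB = 8, 750

layouts = {
    "4 x 2-way mirror":  (DISKS // 2) * SIZE_GB,   # half the raw space
    "2 x 4-disk raidz1": 2 * (4 - 1) * SIZE_GB,    # one parity disk per group
    "1 x 8-disk raidz1": (DISKS - 1) * SIZE_GB,    # one parity disk total
}

for name, usable in layouts.items():
    print(f"{name:18s} ~{usable} GB usable")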
Cindy.Swearingen at Sun.COM
2007-Aug-21 15:57 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
The OpenSolaris ZFS FAQ is here:

http://www.opensolaris.org/os/community/zfs/faq

Other resources are listed here:

http://www.opensolaris.org/os/community/zfs/links/

Cindy

Brandorr wrote:
> P.S. - Is there a ZFS FAQ somewhere?
Matthew Ahrens
2007-Aug-21 19:28 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
Mario Goebbels wrote:
>> There are, however, a few cases where it will not be optimal. E.g., a 129k
>> file will use up 256k of space. However, you can work around this problem
>> by turning on compression.
>
> Doesn't ZFS pack the last block into one of a multiple of 512?

Unfortunately, not yet. See:

5003563 use smaller "tail block" for last block of object

--matt
Roch - PAE
2007-Aug-22 08:49 UTC
[zfs-discuss] Is ZFS efficient for large collections of small files?
> Brandorr wrote:
> > Is ZFS efficient at handling huge populations of tiny-to-small files -
> > for example, 20 million TIFF images in a collection, each between 5
> > and 500k in size?
> >
> > I am asking because I could have sworn that I read somewhere that it
> > isn't, but I can't find the reference.
>
> If you're worried about I/O throughput, you should avoid RAIDZ1/2
> configurations. Random read performance will be disastrous if you do;

A raid-z group can do one random read per I/O latency. So 8 disks (each capable of 200 IOPS) in a zpool split into 2 raid-z groups should be able to serve 400 files per second. If you need to serve more files, then you need more disks, or you need to use mirroring. With mirroring, I'd expect to serve 1600 files (8*200).

This model only applies to random reading, not to sequential access, nor to any type of write load. For small-file creation ZFS can be extremely efficient, in that it can create more than one file per I/O. It should also approach disk streaming performance for write loads.

> I've seen random read rates of less than 1 MB/s on an X4500 with 40
> dedicated disks for data storage.

It would be nice to see if the above model matches your data. If you have all 40 disks in a single raid-z group (an anti best practice), I'd expect <200 files served per second, and if the files were of 5K average size then I'd expect that 1 MB/s.

http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

> If you don't have to worry about disk space, use mirrors;

Right on!

> I got my best results during my extensive X4500 benchmarking sessions
> when I mirrored single slices instead of complete disks (resulting in 40
> 2-way mirrors on 40 physical disks, mirroring c0t0d0s0->c0t1d0s1 and
> c0t1d0s0->c0t0d0s1, and so on). If you're worried about disk space, you
> should consider striping several instances of RAIDZ1 arrays, each one
> consisting of three disks or slices. Sequential access will go off a
> cliff, but random reads will be boosted.

Writes should be good if not great, no matter what the workload is. I'm interested in data that shows otherwise.

> You should also adjust the recordsize.

For small files I certainly would not. Small files are stored as a single record when they are smaller than the recordsize. A single record is good in my book. Not sure when one would want otherwise for small files.

> Try to measure the average I/O transaction size. There's a good chance
> that your I/O performance will be best if you set your recordsize to a
> smaller value. For instance, if your average file size is 12 KB, try
> using an 8K or even 4K recordsize; stay away from 16K or higher.

Tuning the recordsize is currently only recommended for databases (large files) with fixed-record access. Again, it's interesting input if tuning the recordsize helped another type of workload.

-r
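A minimal sketch of the random-read model Roch outlines above, using the 200 IOPS-per-disk figure from his example; the function and the layout labels are illustrative only, not a benchmark:

IOPS_PER_DISK = 200  # figure from Roch's example

def files_per_second(disks, layout, disks_per_group=None):
    """Rough random-read files/sec: raid-z serves one read per group per
    I/O latency, mirrors serve one per disk."""
    if layout == "mirror":
        return disks * IOPS_PER_DISK
    if layout == "raidz":
        groups = disks // disks_per_group
        return groups * IOPS_PER_DISK
    raise ValueError(layout)

print(files_per_second(8, "raidz", disks_per_group=4))    # ~400, as in the mail
print(files_per_second(8, "mirror"))                      # ~1600
print(files_per_second(40, "raidz", disks_per_group=40))  # ~200, the X4500 case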
Roch - PAE
2007-Aug-22 09:04 UTC
[zfs-discuss] Odp: Is ZFS efficient for large collections of small files?
Łukasz K writes:
> > Is ZFS efficient at handling huge populations of tiny-to-small files -
> > for example, 20 million TIFF images in a collection, each between 5
> > and 500k in size?
> >
> > I am asking because I could have sworn that I read somewhere that it
> > isn't, but I can't find the reference.
>
> It depends on what type of I/O you will do. If you only read, there is no
> problem. Writing (and removing) small files will fragment the pool, and
> that will be a huge problem. You can set the recordsize to 32k (or 16k)
> and it will help for some time.

Comparing a recordsize of 16K with 128K:

Files in the range [0, 16K]: no difference.
Files in the range [16K, 128K]: more efficient to use 128K.
Files in the range [128K, 500K]: more efficient to use 16K.

In the [16K, 128K] range the actual file size is rounded up to a multiple of 16K with a 16K recordsize, and to the nearest 512B boundary with a 128K recordsize. This will be fairly catastrophic for files slightly above 16K (rounded up to 32K vs 16K+512B).

In the [128K, 500K] range we're hurt by this:

5003563 use smaller "tail block" for last block of object

Until it is fixed, then yes, files stored using 16K records are rounded up more tightly; metadata probably eats part of the gains.

-r
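A minimal sketch comparing the two recordsizes over those ranges, assuming single-record files round up to 512 bytes and multi-record files round up to whole records (the 5003563 behaviour); metadata is ignored and the helper is illustrative, not ZFS code:

def allocated(file_size, recordsize):
    """Rough allocated bytes for one file under the given recordsize."""
    if file_size <= recordsize:
        return -(-file_size // 512) * 512            # tail rounded to 512B
    return -(-file_size // recordsize) * recordsize  # whole records only

for size_kb in (8, 17, 100, 200, 300):
    size = size_kb * 1024
    print(f"{size_kb:3d}K file: 16K recordsize -> {allocated(size, 16*1024)//1024}K, "
          f"128K recordsize -> {allocated(size, 128*1024)//1024}K")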
Robert Milkowski
2007-Aug-23 07:30 UTC
[zfs-discuss] Odp: Is ZFS efficient for large collections of small files?
Hello Roch,

Wednesday, August 22, 2007, 10:13:10 AM, you wrote:

RP> Łukasz K writes:
>> > Is ZFS efficient at handling huge populations of tiny-to-small files -
>> > for example, 20 million TIFF images in a collection, each between 5
>> > and 500k in size?
>> >
>> > I am asking because I could have sworn that I read somewhere that it
>> > isn't, but I can't find the reference.
>>
>> It depends on what type of I/O you will do. If you only read, there is no
>> problem. Writing (and removing) small files will fragment the pool, and
>> that will be a huge problem. You can set the recordsize to 32k (or 16k)
>> and it will help for some time.

RP> Comparing a recordsize of 16K with 128K:

RP> Files in the range [0, 16K]: no difference.
RP> Files in the range [16K, 128K]: more efficient to use 128K.
RP> Files in the range [128K, 500K]: more efficient to use 16K.

RP> In the [16K, 128K] range the actual file size is rounded up to a
RP> multiple of 16K with a 16K recordsize, and to the nearest 512B boundary
RP> with a 128K recordsize. This will be fairly catastrophic for files
RP> slightly above 16K (rounded up to 32K vs 16K+512B).

RP> In the [128K, 500K] range we're hurt by this:

RP> 5003563 use smaller "tail block" for last block of object

RP> Until it is fixed, then yes, files stored using 16K records are rounded
RP> up more tightly; metadata probably eats part of the gains.

Roch, I guess Lukasz was talking about some problems we're seeing here, which are partly caused by utilizing all-128KB slabs, so forcing the file system to 16KB helps here (for CPU) as a workaround. Sure, we're talking about lots and lots of files, really small.

Perhaps someone could work with Lukasz and investigate it more closely. Lukasz posted some more detailed info not long ago - unfortunately there was no feedback.

--
Best regards,
Robert Milkowski                     mailto:rmilkowski at task.gda.pl
                                     http://milek.blogspot.com