eric kustarz
2007-Apr-23 23:43 UTC
[zfs-discuss] Re: [nfs-discuss] Multi-tera, small-file filesystems
On Apr 18, 2007, at 6:44 AM, Yaniv Aknin wrote:

> Hello,
>
> I'd like to plan a storage solution for a system currently in
> production.
>
> The system's storage is based on code which writes many files to
> the file system, with overall storage needs currently around 40TB
> and expected to reach hundreds of TBs. The average file size of the
> system is ~100K, which translates to ~500 million files today, and
> billions of files in the future. This storage is accessed over NFS
> by a rack of 40 Linux blades, and is mostly read-only (99% of the
> activity is reads). While I realize calling this sub-optimal system
> design is probably an understatement, the design of the system is
> beyond my control and isn't likely to change in the near future.
>
> The system's current storage is based on 4 VxFS filesystems,
> created on SVM meta-devices, each ~10TB in size. A 2-node Sun
> Cluster serves the filesystems, 2 filesystems per node. Each of the
> filesystems undergoes growfs as more storage is made available.
> We're looking for an alternative solution, in an attempt to improve
> performance and the ability to recover from disasters (fsck on 2^42
> files isn't practical, and I'm getting pretty worried due to this
> fact - even the smallest filesystem inconsistency will leave me
> lots of useless bits).
>
> The question is - does anyone here have experience with large ZFS
> filesystems with many small files? Is it practical to base such a
> solution on a few (8) zpools, each with a single large filesystem in it?

Hey Yaniv,

Why not 1 pool? That's what we usually recommend (you can have 8
filesystems on top of the 1 pool if you need to).

eric
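For concreteness, a minimal sketch of the one-pool layout eric
describes - the device names and filesystem names here are hypothetical,
not from the original thread:

    # One pool spanning the available disks (hypothetical devices)
    zpool create tank raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

    # Eight filesystems on top of the single pool, each NFS-shared;
    # they can be mounted and tuned independently, but all draw from
    # the same pool of free space
    for i in 1 2 3 4 5 6 7 8; do
        zfs create tank/fs$i
        zfs set sharenfs=on tank/fs$i
    done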
Leon Koll
2007-Apr-24 00:24 UTC
[zfs-discuss] Re: [nfs-discuss] Multi-tera, small-file filesystems
> [...]
>
> Hey Yaniv,
>
> Why not 1 pool? That's what we usually recommend (you can have 8
> filesystems on top of the 1 pool if you need to).
>
> eric

My guess is that Yaniv assumes that 8 pools with 62.5 million files
each have a significantly smaller chance of being corrupted/causing
data loss than 1 pool with 500 million files in it.
Do you agree with this?

TIA,
-- leon
Richard Elling
2007-Apr-24 00:37 UTC
[zfs-discuss] Re: [nfs-discuss] Multi-tera, small-file filesystems
Leon Koll wrote:

> My guess is that Yaniv assumes that 8 pools with 62.5 million files
> each have a significantly smaller chance of being corrupted/causing
> data loss than 1 pool with 500 million files in it.
> Do you agree with this?

I do not agree with this statement. The probability is the same,
regardless of the number of files. By analogy, if I have 100 people
and the risk of heart attack is 0.1%/year/person, then dividing those
people into groups does not change their risk of heart attack.
 -- richard
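A back-of-the-envelope check of that, with made-up numbers: assume each
pool, independently of its size, has a 0.1%/year chance of fatal
corruption. One pool of 500 million files then loses an expected
0.001 x 500M = 500K files/year; eight pools of 62.5 million files lose
an expected 8 x 0.001 x 62.5M = 500K files/year - the same. Splitting
changes how a loss is distributed, not its expected size; it only helps
if the per-pool failure probability actually drops for smaller pools.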
Leon Koll
2007-Apr-24 00:51 UTC
[zfs-discuss] Re: Re: [nfs-discuss] Multi-tera, small-file filesystems
> Leon Koll wrote:
>
> > My guess is that Yaniv assumes that 8 pools with 62.5 million files
> > each have a significantly smaller chance of being corrupted/causing
> > data loss than 1 pool with 500 million files in it.
> > Do you agree with this?
>
> I do not agree with this statement. The probability is the same,
> regardless of the number of files. By analogy, if I have 100 people
> and the risk of heart attack is 0.1%/year/person, then dividing those
> people into groups does not change their risk of heart attack.
>  -- richard

My analogy was to put those 100 people into 8 elevators instead of one,
especially when one elevator can carry only 13 people. But if you tell
me that the risk of dealing with 500 million files in 1 pool is the
same as with 500 million files in 8 pools, I agree that my analogy is
not relevant.
Anton B. Rang
2007-Apr-24 01:27 UTC
[zfs-discuss] Re: Re: [nfs-discuss] Multi-tera, small-file filesystems
However, the MTTR is likely to be 1/8 the time....
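Presumably because each pool holds only 1/8 of the data, so a
consistency check or a restore after a failure touches 1/8 as much. A
hedged sketch of what that looks like operationally, with made-up pool
names - zpool scrub returns immediately and runs in the background, so
eight small pools can be verified concurrently:

    # Kick off a background scrub of each pool; each scrub covers
    # only ~1/8 of the total data
    for p in tank1 tank2 tank3 tank4 tank5 tank6 tank7 tank8; do
        zpool scrub $p
    done

    # Check progress and any errors found
    zpool status -v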
Gavin Maltby
2007-Apr-26 09:37 UTC
[zfs-discuss] Re: [nfs-discuss] Multi-tera, small-file filesystems
On 04/24/07 01:37, Richard Elling wrote:

> Leon Koll wrote:
>> My guess is that Yaniv assumes that 8 pools with 62.5 million files
>> each have a significantly smaller chance of being corrupted/causing
>> data loss than 1 pool with 500 million files in it.
>> Do you agree with this?
>
> I do not agree with this statement. The probability is the same,
> regardless of the number of files. By analogy, if I have 100 people
> and the risk of heart attack is 0.1%/year/person, then dividing those
> people into groups does not change their risk of heart attack.

Is that not because heart attacks in different people are (under normal
circumstances!) independent events? 8 filesystems backed by a single
pool are not independent; 8 filesystems from 8 distinct pools are a lot
more independent.

Gavin
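To make the failure domains concrete - a hypothetical sketch (device
names invented), in contrast to the single-pool layout earlier in the
thread: eight pools, each built on its own disks, say one per shelf or
controller, so losing one pool leaves the other seven untouched:

    zpool create tank1 raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0
    zpool create tank2 raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0
    # ... and so on through tank8; one filesystem in each pool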