Joe Armstrong
2009-May-19 16:01 UTC
ext3 efficiency, larger vs smaller file system, lots of inodes...
(... to Nabble Ext3:Users - reposted by me after I joined the ext3-users mailing list - sorry for the dup...)

A bit of a rambling subject there, but I am trying to figure out if it is more efficient at runtime to have a few very large file systems (8 TB) vs a larger number of smaller file systems. The file systems will hold many small files.

My preference is to have a larger number of smaller file systems for faster recovery and less impact if a problem does occur, but I was wondering if anybody had information from a runtime performance perspective - is there a difference between few large and many small file systems? Is memory consumption higher for the inode tables if there are more small ones vs one really large one?

Also, does anybody have a reasonable formula for calculating memory requirements of a given file system?

Thanks. Joe
Eric Sandeen
2009-May-19 16:21 UTC
ext3 efficiency, larger vs smaller file system, lots of inodes...
Joe Armstrong wrote:
> A bit of a rambling subject there but I am trying to figure out if it
> is more efficient at runtime to have few very large file systems (8
> TB) vs a larger number of smaller file systems. The file systems
> will hold many small files.
>
> My preference is to have a larger number of smaller file systems for
> faster recovery and less impact if a problem does occur, but I was
> wondering if anybody had information from a runtime performance
> perspective - is there a difference between few large and many small
> file systems? Is memory consumption higher for the inode tables if
> there are more small ones vs one really large one?

It's the VFS that caches dentries & inodes; whether they come from multiple filesystems or one should not change matters significantly.

The other downside to multiple smaller filesystems is space management: when you wind up with half of them full and half of them empty, it may be hard to rearrange. But the extra granularity for better availability and fsck/recovery time may be well worth it. It probably depends on what your application is doing and how it can manage the space. You might want to test filling an 8T filesystem and see for yourself how long fsck will take... it'll be a while. Perhaps a very long while. :)

> Also, does anybody have a reasonable formula for calculating memory
> requirements of a given file system?

Probably the largest memory footprint will be the cached dentries & inodes, though this is a "soft" requirement since it's mostly just cached. Each journal probably has a bit of memory overhead, but I doubt it'll be a significant factor in your decision unless every byte is at a premium...

-Eric
Joe Armstrong
2009-May-19 16:28 UTC
ext3 efficiency, larger vs smaller file system, lots of inodes...
-----Original Message-----
From: Eric Sandeen [mailto:sandeen at redhat.com]
Sent: Tuesday, May 19, 2009 9:21 AM
To: Joe Armstrong
Cc: ext3-users at redhat.com
Subject: Re: ext3 efficiency, larger vs smaller file system, lots of inodes...

> It's the VFS that caches dentries & inodes; whether they come from
> multiple filesystems or one should not change matters significantly.
>
> The other downside to multiple smaller filesystems is space management:
> when you wind up with half of them full and half of them empty, it may
> be hard to rearrange. But the extra granularity for better availability
> and fsck/recovery time may be well worth it.

OK, it sounds like it is mostly a space management issue rather than a performance issue. FWIW, we were planning to handle the space management issue via LVM: allocate some medium-size volumes to start with, leave lots of spare extents unallocated, and then just grow the volume/fs as needed.

Thanks. Joe
Joe Armstrong
2009-May-19 17:08 UTC
ext3 efficiency, larger vs smaller file system, lots of inodes...
> -----Original Message-----
> From: Ric Wheeler [mailto:rwheeler at redhat.com]
> Sent: Tuesday, May 19, 2009 9:54 AM
> To: Joe Armstrong
> Cc: ext3-users at redhat.com
> Subject: Re: ext3 efficiency, larger vs smaller file system, lots of inodes...
>
> How you do this also depends on the type of storage you use. If you
> have multiple file systems on one physical disk (say 2 1TB partitions
> on a 2TB S-ATA disk), you need to be careful not to bash on both file
> systems at once since you will thrash the disk heads.
>
> In general, it is less of an issue with arrays, but still can have a
> performance impact.
>
> Ric

Just for completeness, we will be using striped LUNs (RAID-6 underneath), so I hope that the striping will distribute the I/Os while the RAID-6 device will provide the HA/recovery capabilities.

Joe
Theodore Tso
2009-May-19 17:47 UTC
ext3 efficiency, larger vs smaller file system, lots of inodes...
On Tue, May 19, 2009 at 09:01:47AM -0700, Joe Armstrong wrote:
> A bit of a rambling subject there but I am trying to figure out if
> it is more efficient at runtime to have few very large file systems
> (8 TB) vs a larger number of smaller file systems. The file systems
> will hold many small files.

No, it's not really more efficient to have large filesystems --- efficiency at least in terms of performance, that is. In fact, depending on your workload, it sometimes can be more efficient to have smaller filesystems, since the journal is a single choke-point if you have a fsync-heavy workload. Another advantage of smaller filesystems is that it's faster to fsck a particular filesystem. The disadvantages of breaking up a large filesystem are the obvious ones: you have less flexibility about space allocation, and you can't hard link across different filesystems, which can be a big deal for some folks.

> Is memory consumption higher for the inode tables if
> there are more small ones vs one really large one?

No, because we don't keep an entire filesystem inode table in memory; pieces of it are brought in as needed, and when they aren't needed they are released from memory. About the only thing which is permanently pinned into memory is the block group descriptors, which take up 32 bytes per block group descriptor, where a block group descriptor represents 32 megabytes of storage on disk. So 1 GB of filesystem will require 1k of space, and a 1TB filesystem will require 1 megabyte of memory in terms of block group descriptors.

There are some other overheads, but most of them are fixed overheads, and normally not a problem. The struct superblock data structure takes a kilobyte or so, for example. The buffer heads for the block group descriptors are 56 bytes per 4k of block group descriptors, so 1 megabyte of block group descriptors also requires 14k of buffer heads. Unless you're creating some kind of embedded NAS system, I doubt memory consumption will be a major problem for you.
- Ted
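Ted's figures can be turned into a rough back-of-the-envelope calculator. The following is a sketch using only the numbers from his post (32 bytes per block group descriptor, one descriptor per 32 MB of disk, 56 bytes of buffer heads per 4 KB of descriptor table); actual values depend on block size and filesystem features, so treat it as an order-of-magnitude estimate:

```python
def bgd_memory_bytes(fs_bytes,
                     bytes_per_group=32 * 1024**2,  # storage covered per block group (per Ted's post)
                     bgd_size=32,                   # bytes per block group descriptor
                     bh_per_4k=56):                 # buffer-head overhead per 4 KB of descriptors
    """Rough estimate of memory pinned for ext3 block group descriptors.

    Figures taken from Ted Ts'o's post above; real numbers vary with
    block size and filesystem features.
    """
    groups = -(-fs_bytes // bytes_per_group)        # ceil division: number of block groups
    bgd_bytes = groups * bgd_size                   # descriptor table itself
    bh_bytes = -(-bgd_bytes // 4096) * bh_per_4k    # buffer heads for the descriptor blocks
    return bgd_bytes + bh_bytes

for size, label in [(1 * 1024**3, "1 GB"), (1 * 1024**4, "1 TB"), (8 * 1024**4, "8 TB")]:
    print(f"{label:>5}: ~{bgd_memory_bytes(size) / 1024:.0f} KiB pinned for descriptors")
```

This reproduces Ted's numbers: 1k of descriptors for a 1 GB filesystem, and 1 MB of descriptors plus 14k of buffer heads for a 1 TB one, so even an 8 TB filesystem pins only on the order of 8 MB.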