thr3ads.net - Lustre discuss - [Lustre-discuss] Maximum OST Size [Oct 2010]


Roger Spellman

2010-Oct-18 20:47 UTC

[Lustre-discuss] Maximum OST Size

In Lustre 2.x, what is the largest number of files that we could
possibly have?

 

I noticed that mkfs.lustre on the MDT passes the following parameters to
mkfs.ext2:  -i 4096 -I 512

 

Can these params be smaller?  

 

Can we get more inodes if we use zfs?

 

Thanks.

 

Roger Spellman

Staff Engineer

Terascala, Inc.

508-588-1501

www.terascala.com <http://www.terascala.com/>

 

 


Andreas Dilger

2010-Oct-18 20:59 UTC

[Lustre-discuss] Maximum OST Size

On 2010-10-18, at 14:47, Roger Spellman wrote:
> In Lustre 2.x, what is the largest number of files that we could possibly
> have?
>
> I noticed that mkfs.lustre on the MDT passes the following parameters to
> mkfs.ext2:  -i 4096 -I 512
Do you mean the maximum OST size (as mentioned in the subject) or the maximum
MDT size (above)?  For the ext4-based ldiskfs the maximum size is 16TB and 4B
inodes (this is listed in the manual).
> Can these params be smaller?
For the MDT, yes, you could potentially use "-i 1500" as about the
minimum space per inode, but then you risk running out of space in the
filesystem before running out of inodes.  The "-I 512" parameter
controls the size of the inode itself, which holds the xattrs.  If there are
single-striped files and no use of ACLs, user_xattrs, etc. then you might get by
with "-I 256", but if this xattr space is exceeded then each such
inode will consume 4096 bytes of space and also be slower to access.
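
As a rough illustration of the "-I 256" trade-off described above, the following Python sketch models the average space per file when some fraction of inodes overflow into a 4096-byte external xattr block. The function names are hypothetical, and the model assumes (as a simplification not stated in the thread) that the larger inode size never overflows. Treat it as back-of-the-envelope arithmetic, not a sizing tool.

```python
# Back-of-the-envelope model of the "-I 256" vs "-I 512" trade-off.
# The 4096-byte figure is taken from the message above; everything else
# (names, the "large inodes never overflow" simplification) is an assumption.

EXTERNAL_XATTR_BLOCK = 4096  # bytes consumed when xattrs do not fit in the inode


def avg_bytes_per_file(inode_size: int, overflow_fraction: float) -> float:
    """Average inode plus external-xattr bytes per file.

    overflow_fraction is the share of files whose xattrs exceed the
    in-inode space and therefore need one external block.
    """
    if not 0.0 <= overflow_fraction <= 1.0:
        raise ValueError("overflow_fraction must be between 0 and 1")
    return inode_size + overflow_fraction * EXTERNAL_XATTR_BLOCK


def break_even_fraction(small_inode: int, large_inode: int) -> float:
    """Overflow fraction at which the small inode stops saving space,
    assuming the large inode never overflows."""
    return (large_inode - small_inode) / EXTERNAL_XATTR_BLOCK


print(break_even_fraction(256, 512))   # 0.0625
print(avg_bytes_per_file(256, 0.0625))  # 512.0
```

Under these assumptions, "-I 256" only saves space while fewer than about 1 in 16 files overflow the in-inode xattr area, and that is before counting the slower access mentioned above.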
> Can we get more inodes if we use zfs?
Definitely yes.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.

Roger Spellman

2010-Oct-18 21:32 UTC

[Lustre-discuss] Maximum MDT Size

Sorry, I meant max MDT size (I changed the subject).

I guess that my questions are:
1.  What is the maximum number of files I can get on an MDT with
ldiskfs?
2.  What parameters need to be modified to achieve this?
3.  What is the maximum number of files I can get on an MDT with ZFS?
4.  What parameters need to be modified to achieve this?

Andreas Dilger

2010-Oct-19 22:15 UTC

[Lustre-discuss] Maximum OST Size

On 2010-10-19, at 08:27, Roger Spellman wrote:
> I don't understand this comment:
>> For the MDT, yes, you could potentially use "-i 1500" as about the
>> minimum space per inode, but then you risk running out of space in the
>> filesystem before running out of inodes.
>
> If we set -I to 512, then on an MDT, what else is there that would
> require 1500 bytes per inode?
With "-I 512" the actual inode will consume 512 bytes, so
with "-i 1536" there would be 1024 bytes per inode of block space
still available.  That extra space is needed for everything else in the
filesystem, including the journal, directory blocks, Lustre metadata (last_rcvd,
distributed transaction logs, etc), and any external xattr blocks for
widely-striped files (beyond 12 stripes or so).
> Just ACLs and striping?  If there are no ACLs, and all files are
> single-striped, then could both -i and -I be set to the same value, say 512?
No, this will cause mke2fs to fail.  There needs to be some free space in the
filesystem for the above filesystem/Lustre metadata.  In any case, since the
maximum number of inodes is 2^32 the total filesystem size is not the limiting
factor.
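
The arithmetic in this reply can be sketched in Python. This is a simplified model, assuming the inode count is just the filesystem size divided by the "-i" value and capped at 2^32. The real mke2fs rounds to block groups and reserves other overhead, so actual counts will differ. The function names are hypothetical.

```python
# Simplified model of the "-i" (bytes per inode) and "-I" (inode size)
# arithmetic discussed above. Not a reimplementation of mke2fs.

MAX_INODES = 2**32  # ldiskfs inode limit quoted in the thread
TIB = 2**40


def inode_count(fs_bytes: int, bytes_per_inode: int) -> int:
    """Approximate inode count for a given "-i" value, capped at 2^32."""
    return min(fs_bytes // bytes_per_inode, MAX_INODES)


def block_space_per_inode(bytes_per_inode: int, inode_size: int) -> int:
    """Bytes of block space left per inode after the inode itself."""
    if bytes_per_inode <= inode_size:
        # The thread notes that -i equal to -I makes mke2fs fail.
        raise ValueError("bytes_per_inode must exceed inode_size")
    return bytes_per_inode - inode_size


print(block_space_per_inode(1536, 512))        # 1024
print(inode_count(16 * TIB, 4096) == 2**32)    # True
print(inode_count(16 * TIB, 1536) == 2**32)    # True (capped)
```

In this model a 16 TiB filesystem already reaches the 2^32 cap at the default "-i 4096", so lowering "-i" only adds inodes on smaller MDTs.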

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.

Bernd Schubert

2010-Oct-20 17:00 UTC

[Lustre-discuss] Maximum OST Size

On Wednesday, October 20, 2010, Andreas Dilger wrote:
> On 2010-10-19, at 08:27, Roger Spellman wrote:
> > I don't understand this comment:
> >> For the MDT, yes, you could potentially use "-i 1500" as about the
> >> minimum space per inode, but then you risk running out of space in the
> >> filesystem before running out of inodes.
> >
> > If we set -I to 512, then on an MDT, what else is there that would
> > require 1500 bytes per inode?
>
> With "-I 512" that means the actual inode will consume 512 bytes, so with
> "-i 1536" there would be 1024 bytes per inode of block space still
> available.  That extra space is needed for everything else in the
> filesystem, including the journal, directory blocks, Lustre metadata
> (last_rcvd, distributed transaction logs, etc), and any external xattr
> blocks for widely-striped files (beyond 12 stripes or so).
>
I have to admit, I entirely fail to understand why we should need 2/3 of the
filesystem reserved for real file data.

- journal - 400MB -> negligible with recent decent MDT sizes (1TiB+)
- directory blocks: maybe, but I have not noticed any system where that takes
  more than 5%
- Lustre metadata (last_rcvd, distributed transaction logs, etc) ->
  negligible with recent decent MDT sizes
- external xattr for Lustre lov and additional ACLs: maybe, depends on the
  customer

With the default -i 4096, it looks like this for most customers I know of:

df -h:
973G   57G  861G   7% /lustre/lustre/mdt

df -ih:
278M    248M     31M   89% /lustre/lustre/mdt


So doubling the number of inodes with -i 2048, or even quadrupling it with
-i 1024, seems advisable.
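
The df figures above can be turned into a rough per-inode estimate. This Python sketch assumes that df -h and df -ih report binary units (GiB and Mi), that the "Size" and "Used" columns exclude the inode tables, and that the rounded df values are accurate enough. These are assumptions, not statements from the thread, and the function names are hypothetical.

```python
# Rough per-inode space estimate from the df output quoted above.
# Assumptions: binary units, df excludes inode tables, rounded inputs.

GIB = 2**30
MI = 2**20

size_bytes = 973 * GIB    # df -h "Size"
used_bytes = 57 * GIB     # df -h "Used"
total_inodes = 278 * MI   # df -ih "Inodes"
used_inodes = 248 * MI    # df -ih "IUsed"


def block_bytes_per_used_inode(used: int, inodes: int) -> float:
    """Average block space consumed per allocated inode."""
    return used / inodes


def headroom(bytes_per_inode: int, inode_size: int, observed: float) -> float:
    """How many times the observed per-inode block usage fits into the
    block space that a given -i / -I pair leaves per inode."""
    return (bytes_per_inode - inode_size) / observed


observed = block_bytes_per_used_inode(used_bytes, used_inodes)
print(round(observed))                 # about 235 bytes per inode
print(size_bytes / total_inodes)       # 3584.0, i.e. 4096 - 512
print(headroom(1024, 512, observed) > 2.0)  # True
```

On this system each inode uses only about 235 bytes of block space out of the 3584 available at "-i 4096". Under these assumptions even "-i 1024" would leave more than twice the observed usage, which is consistent with the suggestion above.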


Cheers,
Bernd
