thr3ads.net - Lustre discuss - [Lustre-discuss] Redundancy on OSS/OST nodes [Jan 2007]


Ramon van Alteren

2007-Jan-29 02:04 UTC

[Lustre-discuss] Redundancy on OSS/OST nodes

Hi,

I hope this is the correct list, and the question hasn't been asked
before.

I've read through most of the material on the wiki and the website and I
am currently in the process of building a proof-of-concept Lustre cluster.
One thing isn't clear about the aggregated throughput figures in the
FAQ: http://www.clusterfs.com/faq.html

Stated throughput on a 64-bit linux OSS in the FAQ is:
 Dual-NIC gig-e on a 64-bit OSS: 220 MB/s

We're hoping to use these OSSs to provide access to a large collection
of rather small files.
Most of them are rendered images in various formats; the typical file size
range is 1KB - 100KB.

The total volume is expected to grow well beyond 10TB over time; we're
currently at 4TB.
We would like to achieve the fastest throughput possible.

If I understood everything correctly, parts of a file can / will be
stored on multiple OSSs/OSTs?
Because of this, aggregated throughput for a single file can be higher
than the max throughput per OSS?
What is the smallest element of a file that can be spread over multiple
OSSs/OSTs?

Best Regards,

Ramon van Alteren

Daniel Leaberry

2007-Jan-29 08:20 UTC

Ramon van Alteren wrote:
> Hi,
>
> I hope this is the correct list, and the question hasn't been asked
> before.
>
> I've read through most of the material on the wiki and the website and I
> am currently in the process of building a proof-of-concept Lustre cluster.
> One thing isn't clear about the aggregated throughput figures in the
> FAQ: http://www.clusterfs.com/faq.html
>
> Stated throughput on a 64-bit linux OSS in the FAQ is:
>  Dual-NIC gig-e on a 64-bit OSS: 220 MB/s
>
> We're hoping to use these OSSs to provide access to a large collection
> of rather small files.
> Most of them are rendered images in various formats; the typical file size
> range is 1KB - 100KB.
>
> The total volume is expected to grow well beyond 10TB over time; we're
> currently at 4TB.
> We would like to achieve the fastest throughput possible.
>
> If I understood everything correctly, parts of a file can / will be
> stored on multiple OSSs/OSTs?
> Because of this, aggregated throughput for a single file can be higher
> than the max throughput per OSS?
> What is the smallest element of a file that can be spread over multiple
> OSSs/OSTs?
>

You can stripe, but with files so small you'll see no benefit. You really
don't want to stripe unless you have to. One thing to watch is your
metadata inodes with that many small files. 2TB of disk can store 2
billion inodes. That means a max of 2 billion files. Since 8TB is the
max disk size supported by most distros, you're limited to 8 billion
files in one Lustre filesystem. Then you have to start another filesystem.

Daniel
> Best Regards,
>
> Ramon van Alteren
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>
>
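To illustrate why striping gives no benefit for files this small, here is a minimal sketch of a round-robin (RAID-0 style) stripe layout. The 1 MiB stripe size and the stripe count of 4 are assumptions chosen for illustration, and the function names are hypothetical; none of them come from the thread or from a Lustre API.

```python
def stripe_index_for_offset(offset: int, stripe_size: int, stripe_count: int) -> int:
    """Return the 0-based stripe object holding the byte at `offset`
    under a simple round-robin layout."""
    return (offset // stripe_size) % stripe_count


def objects_touched(file_size: int, stripe_size: int, stripe_count: int) -> int:
    """Return how many distinct stripe objects hold data for a file."""
    if file_size == 0:
        return 0
    chunks = -(-file_size // stripe_size)  # ceiling division
    return min(chunks, stripe_count)


STRIPE_SIZE = 1 << 20   # assumed stripe size: 1 MiB
STRIPE_COUNT = 4        # hypothetical stripe count

# File sizes from the thread (1KB, 100KB) plus a 10MB file for contrast.
for size in (1 << 10, 100 << 10, 10 << 20):
    print(size, objects_touched(size, STRIPE_SIZE, STRIPE_COUNT))
```

Under these assumptions a 1KB or 100KB file fits inside the first stripe, so all of its data lands on a single OST and only one server can serve it. Only files larger than one stripe can draw bandwidth from several OSTs at once.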

Ramon van Alteren

2007-Feb-05 02:36 UTC

Hi Daniel,

Daniel Leaberry wrote:
> You can stripe, but with files so small you'll see no benefit. You
> really don't want to stripe unless you have to. One thing to watch is
> your metadata inodes with that many small files. 2TB of disk can
> store 2 billion inodes. That means a max of 2 billion files. Since 8TB
> is the max disk size supported by most distros, you're limited to 8
> billion files in one Lustre filesystem. Then you have to start another
> filesystem.

Thanks for the reply, but I'm not sure I understand correctly.
Are you telling me that the size of the total Lustre filesystem is
limited by the size of the MDS storage filesystem?
Meaning that in order to store 8 billion files in a Lustre filesystem I
would need an 8TB MDS filesystem?

Or am I reading your answer completely the wrong way?

Is this limitation present in all versions or tied to a specific Lustre
version (i.e. is this true for 1.6 as well)?

Regards,

Ramon

Daniel Leaberry

2007-Feb-05 09:20 UTC

Ramon van Alteren wrote:
> Hi Daniel,
>
> Daniel Leaberry wrote:
>
>> You can stripe, but with files so small you'll see no benefit. You
>> really don't want to stripe unless you have to. One thing to watch is
>> your metadata inodes with that many small files. 2TB of disk can
>> store 2 billion inodes. That means a max of 2 billion files. Since 8TB
>> is the max disk size supported by most distros, you're limited to 8
>> billion files in one Lustre filesystem. Then you have to start another
>> filesystem.
>
> Thanks for the reply, but I'm not sure I understand correctly.
> Are you telling me that the size of the total Lustre filesystem is
> limited by the size of the MDS storage filesystem?
> Meaning that in order to store 8 billion files in a Lustre filesystem I
> would need an 8TB MDS filesystem?
>

Yes. It's limited by inodes. If all you create are 10MB files, you won't
ever be limited by the MDS filesystem, because you consume 4KB there for
every 10MB on the OSTs. If you store small files, you're very much limited
by the MDS filesystem. Best-case scenario, with no striping, you can
use 1KB inodes. Here's a 950GB LUN formatted with 1K inodes. As you can
see, I have 943 million inodes, which means I can store 943 million files.
If all those files are 4KB files on the OSTs, then my max filesystem
size can be no greater than 943 million x 4KB. And because the OSTs are
by default formatted with an inode every 16KB, you could run short there
as well.

[root@lu-mds01 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2              25G  1.3G   22G   6% /
/dev/sda1            1012M   45M  916M   5% /boot
none                  7.9G     0  7.9G   0% /dev/shm
/dev/sdb              450G  489M  405G   1% /var/mnt/lustre01-mds
[root@lu-mds01 ~]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda2            3204992   50072 3154920    2% /
/dev/sda1             131616      44  131572    1% /boot
none                 2060261       1 2060260    1% /dev/shm
/dev/sdb             943652864      24 943652840    1% /var/mnt/lustre01-mds
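
The arithmetic behind that limit can be sketched as follows. The inode count comes from the `df -i` output above, and the 4KB file size and 16KB-per-inode OST ratio are the figures quoted in the message; the function names are hypothetical, and the sketch assumes unstriped files that each use one MDS inode and one OST inode.

```python
KB = 1024

def max_data_bytes(mds_inodes: int, avg_file_size: int) -> int:
    """Upper bound on stored data when every file costs one MDS inode."""
    return mds_inodes * avg_file_size


def ost_usable_fraction(avg_file_size: int, bytes_per_inode: int = 16 * KB) -> float:
    """Fraction of OST space that can be filled before the OST runs out
    of inodes, assuming one OST inode per file; capped at 1.0."""
    return min(1.0, avg_file_size / bytes_per_inode)


MDS_INODES = 943_652_864  # IFree + IUsed for /dev/sdb in the df -i output

limit = max_data_bytes(MDS_INODES, 4 * KB)
print(f"max files: {MDS_INODES}")
print(f"max data with 4KB files: {limit / 1024**4:.2f} TiB")
print(f"OST space usable with 4KB files: {ost_usable_fraction(4 * KB):.0%}")
```

With 4KB files the MDS inode count caps the filesystem at a few terabytes of data regardless of how much OST capacity is attached, and an OST formatted with one inode per 16KB would exhaust its inodes with only a quarter of its space in use.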

> Or am I reading your answer completely the wrong way?
>
> Is this limitation present in all versions or tied to a specific Lustre
> version (i.e. is this true for 1.6 as well)?
>

This is a limitation of all Lustre versions. Once disjoint clustered
MDSs arrive, you can have as many files as you like in one filesystem.

> Regards,
>
> Ramon
>
