Joan J. Piles
2011-Mar-15 15:02 UTC
[Lustre-discuss] Problem with lustre 2.0.0.1, ext3/4 and big OSTs (>8Tb)
Hi,

We are trying to set up a Lustre 2.0.0.1 installation (the most recent one downloadable from the official site). We plan to have some big OSTs (~12TB), using Scientific Linux 5.5 (which should be a RHEL clone for all purposes).

However, when we try to format the OSTs, we get the following error:

> [root@oss01 ~]# mkfs.lustre --ost --fsname=extra \
>     --mgsnode=172.16.4.4@tcp0 \
>     --mkfsoptions '-i 262144 -E stride=32,stripe_width=192' /dev/sde
>
>    Permanent disk data:
> Target:     extra-OSTffff
> Index:      unassigned
> Lustre FS:  extra
> Mount type: ldiskfs
> Flags:      0x72
>             (OST needs_index first_time update )
> Persistent mount opts: errors=remount-ro,extents,mballoc
> Parameters: mgsnode=172.16.4.4@tcp
>
> checking for existing Lustre data: not found
> device size = 11427830MB
> formatting backing filesystem ldiskfs on /dev/sde
>         target name  extra-OSTffff
>         4k blocks    2925524480
>         options      -i 262144 -E stride=32,stripe_width=192 -J size=400
>                      -I 256 -q -O dir_index,extents,uninit_bg -F
> mkfs_cmd = mke2fs -j -b 4096 -L extra-OSTffff -i 262144 -E
> stride=32,stripe_width=192 -J size=400 -I 256 -q -O
> dir_index,extents,uninit_bg -F /dev/sde 2925524480
> mkfs.lustre: Unable to mount /dev/sde: Invalid argument
>
> mkfs.lustre FATAL: failed to write local files
> mkfs.lustre: exiting with 22 (Invalid argument)

In the dmesg log, we find the following line:

> LDISKFS-fs does not support filesystems greater than 8TB and can cause
> data corruption. Use "force_over_8tb" mount option to override.

After some investigation, we found it is related to the use of ext3 instead of ext4, even though we should be using ext4, as proven by the fact that the file systems created are actually ext4:

> [root@oss01 ~]# file -s /dev/sde
> /dev/sde: Linux rev 1.0 ext4 filesystem data (extents) (large files)

Further, we made a test with an ext3 filesystem on the same machine, and there the difference shows:

> [root@oss01 ~]# file -s /dev/sda1
> /dev/sda1: Linux rev 1.0 ext3 filesystem data (large files)

Everything we found on the net about this problem seems to refer to Lustre 1.8.5. However, we would not expect such a regression in Lustre 2. Is this actually a problem with Lustre 2? Does ext4 have to be enabled either at compile time or with a parameter somewhere (we found no documentation about it)?

Greetings and thanks,

--
--------------------------------------------------------------------------
Joan Josep Piles Contreras - Systems Analyst
I3A - Instituto de Investigación en Ingeniería de Aragón
Tel: 976 76 10 00 (ext. 5454)
http://i3a.unizar.es -- jpiles@unizar.es
--------------------------------------------------------------------------
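For readers hitting the same wall: the dmesg message names a "force_over_8tb" mount option. Below is a minimal sketch of how that override could be passed at format time, assuming mkfs.lustre's --mountfsoptions flag behaves here as documented (it replaces the default persistent mount options, so those are repeated). The option only silences the size check; it does not add ext4 support, so the corruption warning still applies to an ext3-based ldiskfs.

    # Sketch only: bypasses the >8TB check, does NOT make an ext3-based
    # ldiskfs safe above 8TB. The default mount options are repeated
    # because --mountfsoptions overrides them rather than appending.
    mkfs.lustre --ost --fsname=extra --mgsnode=172.16.4.4@tcp0 \
        --mountfsoptions='errors=remount-ro,extents,mballoc,force_over_8tb' \
        --mkfsoptions '-i 262144 -E stride=32,stripe_width=192' /dev/sde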
Kevin Van Maren
2011-Mar-15 15:22 UTC
[Lustre-discuss] Problem with lustre 2.0.0.1, ext3/4 and big OSTs (>8Tb)
Joan J. Piles wrote:

> Hi,
>
> We are trying to set up a Lustre 2.0.0.1 installation (the most recent
> one downloadable from the official site). We plan to have some big
> OSTs (~12TB), using Scientific Linux 5.5 (which should be a RHEL clone
> for all purposes).
>
> However, when we try to format the OSTs, we get the following error:
>
>> [root@oss01 ~]# mkfs.lustre --ost --fsname=extra \
>>     --mgsnode=172.16.4.4@tcp0 \
>>     --mkfsoptions '-i 262144 -E stride=32,stripe_width=192' /dev/sde
>> [...]
>> mkfs.lustre: Unable to mount /dev/sde: Invalid argument
>>
>> mkfs.lustre FATAL: failed to write local files
>> mkfs.lustre: exiting with 22 (Invalid argument)
>
> In the dmesg log, we find the following line:
>
>> LDISKFS-fs does not support filesystems greater than 8TB and can cause
>> data corruption. Use "force_over_8tb" mount option to override.
>
> After some investigation, we found it is related to the use of ext3
> instead of ext4,

Correct.

> even though we should be using ext4, as proven by the fact that the
> file systems created are actually ext4:
>
>> [root@oss01 ~]# file -s /dev/sde
>> /dev/sde: Linux rev 1.0 ext4 filesystem data (extents) (large files)

No, these are "ldiskfs" filesystems. ext3+ldiskfs looks a bit like ext4 (ext4 is largely based on the enhancements done for Lustre's ldiskfs), but it is not the same as ext4+ldiskfs. In particular, the file system size is limited to 8TB, not 16TB.

> Further, we made a test with an ext3 filesystem on the same machine,
> and there the difference shows:
>
>> [root@oss01 ~]# file -s /dev/sda1
>> /dev/sda1: Linux rev 1.0 ext3 filesystem data (large files)
>
> Everything we found on the net about this problem seems to refer to
> Lustre 1.8.5. However, we would not expect such a regression in Lustre
> 2. Is this actually a problem with Lustre 2? Does ext4 have to be
> enabled either at compile time or with a parameter somewhere (we found
> no documentation about it)?

Lustre 2.0 did not enable ext4 by default, due to known issues. You can rebuild the Lustre server with "--enable-ext4" on the configure line to enable it. But if you are going to use 12TB LUNs, you should either stick with v1.8.5 (stable), or pull a newer version from git (experimental).

Kevin
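A minimal sketch of the rebuild Kevin describes, assuming an unpacked Lustre 2.0.0.1 source tree and patched server kernel sources already in place (the paths and the packaging step are illustrative, not prescriptive):

    # Rebuild the server with the ext4-based ldiskfs enabled.
    cd lustre-2.0.0.1                                # source tree (path assumed)
    ./configure --enable-ext4 \
        --with-linux=/usr/src/kernels/$(uname -r)    # patched kernel sources
    make rpms                                        # or `make && make install`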
Joan J. Piles
2011-Mar-15 16:36 UTC
[Lustre-discuss] Problem with lustre 2.0.0.1, ext3/4 and big OSTs (>8Tb)
We have tried recompiling ldiskfs with ext4 enabled, and so far it seems to create the file systems without any further problem. The only known issue we found is in the Release Notes:

> Enabling ext4 allows LUNs larger than 8 TB to be used in the Lustre
> file system. When ext4 is enabled, by default, in a system at scale,
> servers become overloaded (cause unknown). This results in clients
> timing out and attempting to reconnect, an action which the server
> does not accept. Eventually, the server evicts the client due to a
> lock timeout.
> Workaround: Do not enable ext4 in Lustre 2.0.0.

What number of clients does "a system at scale" mean? We are expecting to have at most 1500 processes on 150 nodes accessing the filesystem. Is this big enough to trigger the issue?

Since this is going to be a production system, using an experimental version is out of the question. Should we stick with 1.8 and forget about 2.0? Will there soon be a 2.0.0.x release addressing these issues?

Thanks,

On 15/03/11 16:22, Kevin Van Maren wrote:

> Lustre 2.0 did not enable ext4 by default, due to known issues. You
> can rebuild the Lustre server with "--enable-ext4" on the configure
> line to enable it. But if you are going to use 12TB LUNs, you should
> either stick with v1.8.5 (stable), or pull a newer version from git
> (experimental).
>
> Kevin

--
--------------------------------------------------------------------------
Joan Josep Piles Contreras - Systems Analyst
I3A - Instituto de Investigación en Ingeniería de Aragón
Tel: 976 76 10 00 (ext. 5454)
http://i3a.unizar.es -- jpiles@unizar.es
--------------------------------------------------------------------------
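One way to confirm the rebuilt ldiskfs actually took effect is to repeat the original format and watch the kernel log; a hedged sketch, reusing the command from earlier in the thread (note from Kevin's reply that "file -s" reporting "ext4" proves nothing here, so the absence of the 8TB warning is the better tell):

    # Re-run the previously failing format against the rebuilt modules.
    mkfs.lustre --ost --fsname=extra --mgsnode=172.16.4.4@tcp0 \
        --mkfsoptions '-i 262144 -E stride=32,stripe_width=192' /dev/sde
    dmesg | grep -i '8TB'    # should no longer show the "does not support" line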