Hello!

I'm new to Lustre and I have just one question. While reading the literature and playing with my test Lustre system I noticed what is (for me) strange behaviour with a full OST. For instance, I copied a 2.5 GB file to a Lustre filesystem with 120 GB of storage space (I have 2 GB test OSTs). It did not automatically recognize the full OST; the copy simply stopped with a "No space left on device" error message. There was plenty of space left on the filesystem (about 100 GB).

I'm aware that I can stripe the file over several OSTs, but this should be done automatically! If the system detects that one of the OSTs is full, it should put it in the offline state automatically. I just can't believe that I have to manually watch which OST is getting full and put it offline as described here: http://wiki.lustre.org/index.ph/Handling_Full_OSTs

Can that be automated? Is it already done? Why is it not part of Lustre? Is it planned for some future version?

Thank you for your answer!
On Fri, Jan 29, 2010 at 10:32:26AM +0100, gvozden rovina wrote:
> OST. For instance i copied 2.5 GB file to lustre which had 120 GB storage
> space (I have 2GB test OSTs) and it didn't automatically recognized full
> OST but it simply stopped working with " No space left on device" error
> message. There was plenty of space left on filesystem (cca 100GB). I'm
> aware that I can stripe the file over several OSTs but this should be done
> automatically! If the system detect that one of the OST is full it should
> put it in offline state automatically. I just cant believe that I have to
> manualy watch over which OST is getting full and putting it offline like
> it is described here:

The MDS monitors OST disk usage by regularly sending OST_STATFS RPCs, and it won't allocate *new* files on OSTs that are full. This means that you don't need to put full OSTs offline on the MDS; those OSTs will be skipped automatically at file creation time.

That being said, we do *not* migrate existing files stored on full OSTs or increase the stripe count dynamically. The default stripe count is 1, and since your OST size is 2 GB, the maximum file size is by default limited to 2 GB (even less with metadata overhead). You can of course change the stripe count with lfs setstripe.

Restriping files would require the layout lock feature, which is not available in any Lustre release yet.

Johann
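To make the arithmetic in Johann's reply concrete, here is a minimal Python sketch. It assumes the file's data is spread evenly over its stripes, so the file can grow to at most the stripe count times the free space on the least-free OST it uses. The function names are illustrative only and are not part of any Lustre tool.

```python
GB = 1024 ** 3


def max_file_size(stripe_count: int, min_ost_free: int) -> int:
    """Upper bound on file size, assuming data is spread evenly over the stripes."""
    return stripe_count * min_ost_free


def min_stripes_needed(file_size: int, min_ost_free: int) -> int:
    """Smallest stripe count whose combined free space can hold the file."""
    if min_ost_free <= 0:
        raise ValueError("OSTs have no free space")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    return max(1, -(-file_size // min_ost_free))


if __name__ == "__main__":
    ost_free = 2 * GB            # assumption: empty 2 GB OSTs, overhead ignored
    file_size = int(2.5 * GB)    # the file from the original post

    # With the default stripe count of 1 the file cannot fit on one OST.
    print(max_file_size(1, ost_free) >= file_size)
    # Stripe count that would be needed for this file.
    print(min_stripes_needed(file_size, ost_free))
```

Under these assumptions the 2.5 GB file needs at least two stripes, which would be requested before the copy with lfs setstripe (for example by setting a stripe count of 2 on the target directory). Real OSTs offer somewhat less than their nominal size, so the true minimum may be higher.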
Hi!

Thank you for your swift answer. I have just one more question. Is it possible to configure a Lustre system so that it writes not just the file but also a copy of the same file at the time you create it?

Thanks!
On Fri, Jan 29, 2010 at 11:48:58AM +0100, gvozden rovina wrote:
> Thank you for your swift answer. I have just one more question. Is it
> possible to configure lustre system so that it writes not just the file
> but also the copy of the same file in the same time as you create it?

No, we don't support network RAID-1/mirroring yet.

Johann
Thanks for the answer. Goodbye for now.
On 2010-01-29, at 03:07, Johann Lombardi wrote:
> On Fri, Jan 29, 2010 at 10:32:26AM +0100, gvozden rovina wrote:
>> OST. For instance i copied 2.5 GB file to lustre which had 120 GB
>> storage space (I have 2GB test OSTs) and it didn't automatically
>> recognized full OST but it simply stopped working with " No space
>> left on device" error message. There was plenty of space left on
>> filesystem (cca 100GB).
>
> The mds monitors OST disk usage by regularly sending OST_STATFS rpcs
> and it won't allocate *new* files on OSTs that are full. This means
> that you don't need to put full OSTs offline on the MDS, those OSTs
> will be skipped automatically at file creation time.

Note also that we expect OSTs to be configured with a MUCH larger size than 2 GB. A typical size is 8 TB, and in the near future 16 TB OSTs will be possible. The object allocation policy assumes that an individual file is smaller than a single OST, and for extremely large files (i.e. multi-TB) the user can set the striping for the file wide enough to have sufficient space.

For applications that don't know how many stripes to use, it is also possible to have the MDS compute this based on the expected file size (assuming the application knows this) and the current OST space availability:

    mknod({lustre_filename}, S_IFREG | {file_perms}, 0);
    truncate({lustre_filename}, {expected_size});
    open({lustre_filename}, {open_mode});

When the file is opened, it will be striped widely enough to allow {expected_size} to be written to it, assuming there is enough space on each OST such that:

    min_ost_free * num_stripes >= expected_size

This doesn't actually _reserve_ that space, so if multiple nodes are writing huge files and there isn't enough space in the filesystem, you can still run out of space.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
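The mknod/truncate/open sequence Andreas describes can be sketched in Python with the standard os module. This is only an illustration of the call order: the helper name create_with_size_hint and its default arguments are made up for this example, and the striping decision itself happens inside Lustre, not in this code. On a non-Lustre filesystem the same calls simply produce a sparse file of the requested size.

```python
import os
import stat


def create_with_size_hint(path, expected_size, perms=0o644, flags=os.O_WRONLY):
    """Create a file, record its expected size, then open it.

    Hypothetical helper following the sequence from the thread. Returns a
    file descriptor that the caller must close.
    """
    # 1. Create the regular file without opening it.
    os.mknod(path, stat.S_IFREG | perms)
    # 2. Set the size the application expects to write.
    os.truncate(path, expected_size)
    # 3. Open the file; per the thread, this is the point at which Lustre
    #    picks a stripe count wide enough for the expected size.
    return os.open(path, flags)


if __name__ == "__main__":
    import tempfile

    # Demonstration on a temporary directory (an assumption for the example;
    # a real use would pass a path on a mounted Lustre filesystem).
    with tempfile.TemporaryDirectory() as tmp:
        fd = create_with_size_hint(os.path.join(tmp, "big.dat"), 4096)
        try:
            print(os.fstat(fd).st_size)
        finally:
            os.close(fd)
```

As noted in the message, this does not reserve any space, so the application should still be prepared to handle a "No space left on device" error while writing.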