I see that 1.8.2 supports 16T OSTs for RHEL. Does anyone know when this will be supported for SLES?

Is anyone currently using a 16T OST who could share their experiences? Is it stable?

Thanks.

Roger Spellman
Staff Engineer
Terascala, Inc.
508-588-1501
www.terascala.com
On 2010-02-09, at 15:02, Roger Spellman wrote:
> I see that 1.8.2 supports 16T OSTs for RHEL.
>
> Does anyone know when this will be supported for SLES?

No, it will not, because SLES doesn't provide very up-to-date ext4 code, and a number of 16TB fixes went into ext4 late in the game. RHEL 5.4, on the other hand, has very up-to-date ext4 code, and the RHEL ext4 maintainer is himself one of the upstream ext4 maintainers.

> Is anyone currently using a 16T OST, who could share their
> experiences? Is it stable?

I believe a few large customers are already testing/using this. I'll let them speak for themselves.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Tuesday 09 February 2010, Roger Spellman wrote:
> I see that 1.8.2 supports 16T OSTs for RHEL.
...
> Is anyone currently using a 16T OST, who could share their experiences?
> Is it stable?

A bit OT (this doesn't relate specifically to SLES), but it does relate to 16T support. We currently have 1T drives in a 10+2 configuration, yielding 10TB arrays. Today we have to split each of these into two logical parts to get under 8T; that is, we could already use 16T support today. Tomorrow, however, we'll likely have 2T drives in a 10+2 configuration, and a 16T limit will be just as constraining as 8T is today. So, if you're using ext4 to go beyond 8T, why not go further than 16T?

/Peter
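[Editor's note: the array sizes above can be sanity-checked with a little shell arithmetic. In a 10+2 RAID-6 layout, 10 of the 12 drives hold data, so usable capacity is 10x the drive size; the drive sizes are the ones Peter mentions.]

```shell
# Usable capacity of a 10+2 RAID-6 array: 10 data drives + 2 parity drives.
data_drives=10
for drive_tb in 1 2; do
    echo "${drive_tb}TB drives -> $(( drive_tb * data_drives ))TB usable per array"
done
# 1TB drives give 10TB arrays (over today's 8T limit);
# 2TB drives give 20TB arrays (over even the new 16T limit).
```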
On 2010-02-10, at 07:39, Roger Spellman wrote:
> Thank you. Based on the kernel version string, we had assumed that SLES
> was closer to the latest kernel.org release than RHEL. That appears not
> to be the case.
>
> Just curious, why is the limit now 16T? This works nicely for 2T drives
> in an 8+2 RAID-6. But is there a reason that the limit couldn't be
> much higher, say 64T or 256T?

Two reasons for this:
- primarily, the upstream e2fsprogs does not yet have full support for >16TB filesystems, and while experimental patches exist, there are still bugs being found occasionally in that code
- there is a certain amount of testing that we need to do before we can say that Lustre supports that configuration

That said, with 1.8.2 it is still possible to format the filesystem with the experimental 64-bit e2fsprogs, mount the OSTs with "-o force_over_16tb", and test this out yourselves. Feedback is of course welcome. I would suggest running "llverfs" on the mounted Lustre filesystem (or another tool that can verify the data after the fact) to completely fill an OST, then unmounting/remounting it to clear any cache, reading the data back, and ensuring that you get the correct data back. Running "e2fsck -f" on the OST would also help verify the on-disk filesystem structure.

At some point we will likely conduct this same testing and "officially" support this configuration, but it wasn't done for 1.8.2. At some point the e2fsck overhead of a large ext4/ldiskfs filesystem becomes too high to support huge configurations (e.g. larger than, say, 128TB, if even that).
While ext4 and e2fsprogs have gotten a lot of improvements to speed up e2fsck, there is a limit to what can be done with this.

[quoted message trimmed]

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
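[Editor's note: the 16TB boundary falls out of ext4's 32-bit block numbers with 4KiB blocks, and Andreas's suggested test procedure might look like the sketch below. This is only an illustration: the device (/dev/sdX), filesystem name, and mount points are hypothetical placeholders; "-o force_over_16tb", llverfs, and "e2fsck -f" are the pieces named in the thread.]

```shell
# 32-bit block numbers x 4KiB blocks = the 16TiB ext4/ldiskfs limit:
echo "ext4/ldiskfs limit: $(( ( (1 << 32) * 4096 ) >> 40 )) TiB"

# Hypothetical >16TB OST test run (destructive; device and paths are placeholders):
# mkfs.lustre --ost --fsname=testfs --mgsnode=mgs@tcp0 /dev/sdX  # 64-bit e2fsprogs installed
# mount -t lustre -o force_over_16tb /dev/sdX /mnt/ost0          # on the OSS
# llverfs -w -v /mnt/testfs       # on a client: write known patterns until the OST is full
# umount /mnt/ost0                # unmount/remount so reads cannot be served from cache
# mount -t lustre -o force_over_16tb /dev/sdX /mnt/ost0
# llverfs -r -v /mnt/testfs       # read the patterns back and verify them
# umount /mnt/ost0 && e2fsck -f /dev/sdX   # verify the on-disk filesystem structure
```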
On Wed, Feb 10, 2010 at 02:41:55PM -0700, Andreas Dilger wrote:
> Two reasons for this:
> - primarily, the upstream e2fsprogs does not yet have full support for
>   >16TB filesystems, and while experimental patches exist there are still
>   bugs being found occasionally in that code
> - there is a certain amount of testing that we need to do before we can say
>   that Lustre supports that configuration

We are interested in testing OSTs larger than 16 TB. I found a public repository for 64-bit e2fsprogs at

  git://git.kernel.org/pub/scm/fs/ext2/val/e2fsprogs.git

Is this where I'd find the patches you refer to? Would I apply them against the 1.8.2 distribution's e2fsprogs sources?

David Simas
SLAC
On 2010-02-10, at 17:29, David Simas wrote:
> We are interested in testing OSTs larger than 16 TB. I found a
> public repository for 64-bit e2fsprogs at
>
>   git://git.kernel.org/pub/scm/fs/ext2/val/e2fsprogs.git
>
> Is this where I'd find the patches you refer to?

This is an outdated version of the 64-bit patches. The updated 64-bit patches are part of the main e2fsprogs git repo at:

  http://repo.or.cz/w/e2fsprogs.git

This has been in a state of flux since 1.41.10 was released, because the 64-bit patches are slated to be part of the 1.42 release. I believe that most of the 64-bit patches are in the "pu" (proposed updates) branch.

> Would I apply them against the 1.8.2 distribution's e2fsprogs sources?

For 64-bit testing purposes, I think you can use the upstream 64-bit mke2fs without any of the Lustre-specific changes. Those are mostly related to making e2fsck more robust, and to adding support for lfsck and the like.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
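[Editor's note: for anyone wanting to try this, fetching the branch Andreas mentions might look like the sketch below. The repository URL and branch name are the ones given above; the build steps are the standard e2fsprogs ones and may differ for your environment, and the "pu" branch layout may have changed since this was written.]

```shell
# Clone the main e2fsprogs repository and switch to the proposed-updates branch:
git clone git://repo.or.cz/e2fsprogs.git
cd e2fsprogs
git checkout -b pu origin/pu    # branch carrying the 64-bit patches slated for 1.42

# Standard e2fsprogs build, run from the source tree:
./configure
make
```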