Ron
2008-Feb-22 18:59 UTC
[Lustre-discuss] 2.6.23 client systems with any compatible server
I want to have a lustre client running on a system with 2.6.23.12 kernel. (The reason is that there is a special patch that is required for these 60+ Quad-Core AMD Opteron systems that we have and the patch is currently only available for this 2.6.23.12 kernel).

Does anyone have a recommendation of how I should get a client and then a compatible server? For the server, we only need minimal throughput, we just would like to see if Lustre can manage a filesystem created on a 40 TB disk system attached via a qlogic fibre channel adapter. We were planning to run the mgs, mdt, and several OSTs on a single system. There are no kernel constraints for the server. (We currently have a redhat base x86_64 distribution loaded.)

I have tried a lustre CVS client (20080116) for the 2.6.23.12, but seem to have run into a compatibility issue with the 1.6.4.*-vanilla_2.6.18.8 server.

Any recommendation for any 2.6.23.12 client and ??? server I should try next?

Thanks,
Ron
Canon, Richard Shane
2008-Feb-23 03:32 UTC
[Lustre-discuss] 2.6.23 client systems with any compatible server
Ron,

I'm trying a slightly different strategy. I'm trying to backport the TLB patch back to a RHEL5 kernel (which would work on SL5 too). If I get this patch working, I can let you know. This may be easier than trying to get Lustre running on a 2.6.23 system.

--Shane
Andreas Dilger
2008-Feb-25 04:19 UTC
[Lustre-discuss] 2.6.23 client systems with any compatible server
On Feb 22, 2008 10:59 -0800, Ron wrote:
> I want to have a lustre client running on a system with 2.6.23.12
> kernel. (The reason is that there is a special patch that is required
> for these 60+ Quad-Core AMD Opteron systems that we have and the patch
> is currently only available for this 2.6.23.12 kernel).
>
> Does anyone have a recommendation of how I should get a client and
> then a compatible server?
> For the server, we only need minimal throughput, we just would like to
> see if Lustre can manage a filesystem created on a 40 TB disk system
> attached via a qlogic fibre channel adapter. We were planning to run
> the mgs, mdt, and several OSTs on a single system. There are no kernel
> constraints for the server. (We currently have a redhat base x86_64
> distribution loaded.)
>
> I have tried a lustre CVS client (20080116) for the 2.6.23.12, but
> seem to have run into a compatibility issue with the 1.6.4.*-
> vanilla_2.6.18.8 server.

You should be able to build a "patchless" lustre client for kernels > 2.6.16, though I'm not sure if we have all of the kernel API changes for > 2.6.22 in the Lustre code. The issue is that even though Lustre clients no longer require kernel patches, the kernel APIs change without notice or documentation, so there is always _something_ broken when a new kernel is released.

Could you also elaborate on the "1.6.4.* compatibility issue"? There shouldn't be any compatibility problems between 1.6 releases, though the current b1_6 development branch has a feature (adaptive timeouts) which is likely to be removed before the final release. I would suggest getting the specific Lustre release you want by CVS tag (v1_6_4_3 probably) instead of the CVS tip.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
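The kernel ranges mentioned in this reply can be turned into a quick check before attempting a build. This is only a minimal sketch: it assumes the thresholds are read literally from the message (strictly newer than 2.6.16 for patchless clients, with kernel API coverage unconfirmed beyond 2.6.22). The function names and return labels are hypothetical and are not part of Lustre.

```python
import re

# Thresholds taken from the message above and compared strictly (">"),
# as written there. They are assumptions, not values from Lustre itself.
PATCHLESS_ABOVE = (2, 6, 16)   # patchless clients reported to build on newer kernels
UNVERIFIED_ABOVE = (2, 6, 22)  # API coverage beyond this series was not confirmed


def parse_kernel(release: str) -> tuple:
    """Turn '2.6.23.12' or '2.6.18-53.el5' into a tuple of leading integers."""
    match = re.match(r"\d+(?:\.\d+)*", release)
    if match is None:
        raise ValueError(f"not a kernel release string: {release!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def patchless_outlook(release: str) -> str:
    """Classify a kernel release against the ranges quoted in the thread."""
    series = parse_kernel(release)[:3]  # compare on the 2.6.x series only
    if series > UNVERIFIED_ABOVE:
        return "patchless-unverified"   # may hit kernel API changes not yet handled
    if series > PATCHLESS_ABOVE:
        return "patchless"
    return "outside-range"              # the message makes no claim about these


if __name__ == "__main__":
    for release in ("2.6.23.12", "2.6.18.8", "2.6.16"):
        print(release, patchless_outlook(release))
```

For the 2.6.23.12 client kernel in question, the check lands in the "unverified" bucket, which matches the caution in the reply: the build may work, but kernel API changes after 2.6.22 may not yet be handled.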
Jim Garlick
2008-Feb-25 16:24 UTC
[Lustre-discuss] 2.6.23 client systems with any compatible server
On Sun, Feb 24, 2008 at 09:19:27PM -0700, Andreas Dilger wrote:
>
> Could you also elaborate on the "1.6.4.* compatibility issue"? There
> shouldn't be any compatibility problems between 1.6 releases, though
> the current b1_6 development branch has a feature (adaptive timeouts)
> which is likely to be removed before the final release. I would suggest
> getting the specific Lustre release you want by CVS tag (v1_6_4_3 probably)
> instead of the CVS tip.

Andreas, are you saying adaptive timeouts are likely to be deferred to 1.8?

Jim
Andreas Dilger
2008-Feb-25 19:45 UTC
[Lustre-discuss] 2.6.23 client systems with any compatible server
On Feb 25, 2008 08:24 -0800, Jim Garlick wrote:
> On Sun, Feb 24, 2008 at 09:19:27PM -0700, Andreas Dilger wrote:
> > Could you also elaborate on the "1.6.4.* compatibility issue"? There
> > shouldn't be any compatibility problems between 1.6 releases, though
> > the current b1_6 development branch has a feature (adaptive timeouts)
> > which is likely to be removed before the final release. I would suggest
> > getting the specific Lustre release you want by CVS tag (v1_6_4_3 probably)
> > instead of the CVS tip.
>
> Andreas, are you saying adaptive timeouts are likely to be deferred to 1.8?

The recent testing of AT showed quite bad behaviour, so it cannot be released as-is. We will have more capability for testing this internally very soon, and I think that LLNL is also moving to test a newer version of Lustre so this should allow resolving the issues with adaptive timeouts more quickly.

For the 1.6.5 release we plan to disable AT only with a configure option instead of reverting the code entirely (unless of course we get a lot of testing and some bug fixes very soon). This will allow testing AT more easily.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Jim Garlick
2008-Feb-26 16:09 UTC
[Lustre-discuss] 2.6.23 client systems with any compatible server
On Mon, Feb 25, 2008 at 11:45:14AM -0800, Andreas Dilger wrote:
> On Feb 25, 2008 08:24 -0800, Jim Garlick wrote:
> > On Sun, Feb 24, 2008 at 09:19:27PM -0700, Andreas Dilger wrote:
> > > Could you also elaborate on the "1.6.4.* compatibility issue"? There
> > > shouldn't be any compatibility problems between 1.6 releases, though
> > > the current b1_6 development branch has a feature (adaptive timeouts)
> > > which is likely to be removed before the final release. I would suggest
> > > getting the specific Lustre release you want by CVS tag (v1_6_4_3 probably)
> > > instead of the CVS tip.
> >
> > Andreas, are you saying adaptive timeouts are likely to be deferred to 1.8?
>
> The recent testing of AT showed quite bad behaviour, so it cannot be
> released as-is. We will have more capability for testing this internally
> very soon, and I think that LLNL is also moving to test a newer version
> of Lustre so this should allow resolving the issues with adaptive timeouts
> more quickly.
>
> For the 1.6.5 release we plan to disable AT only with a configure option
> instead of reverting the code entirely (unless of course we get a lot of
> testing and some bug fixes very soon). This will allow testing AT more
> easily.

We are keen to get AT working and we anticipated dedicating the bulk of our test resources and people to stabilizing and hardening a release including AT over the next four months.

Is there a projected date for 1.6.5 release, or would it be possible to modify 1.6 plans to include a 1.6.6 release that does not release until AT works? We would very much like to see it in a 1.6 release as opposed to 1.8 which includes so many other new things.

Jim
Andreas Dilger
2008-Feb-27 20:02 UTC
[Lustre-discuss] 2.6.23 client systems with any compatible server
On Feb 26, 2008 08:09 -0800, Jim Garlick wrote:
> On Mon, Feb 25, 2008 at 11:45:14AM -0800, Andreas Dilger wrote:
> > The recent testing of AT showed quite bad behaviour, so it cannot be
> > released as-is. We will have more capability for testing this internally
> > very soon, and I think that LLNL is also moving to test a newer version
> > of Lustre so this should allow resolving the issues with adaptive timeouts
> > more quickly.
> >
> > For the 1.6.5 release we plan to disable AT only with a configure option
> > instead of reverting the code entirely (unless of course we get a lot of
> > testing and some bug fixes very soon). This will allow testing AT more
> > easily.
>
> We are keen to get AT working and we anticipated dedicating the bulk of
> our test resources and people to stabilizing and hardening a release including
> AT over the next four months.
>
> Is there a projected date for 1.6.5 release, or would it be possible to
> modify 1.6 plans to include a 1.6.6 release that does not release until AT
> works?

There is some room for including AT into the 1.6.5 release, but that would only happen if there were good at-scale testing results in the next week or two. If it takes 2 months to stabilize AT then we would rather make a 1.6.5 release in the meantime and have AT in 1.6.6 (or whenever). Delaying the fixes already slated for 1.6.5 further than needed doesn't benefit anyone.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.