Hi,

It has come to our attention that Lustre is occasionally found to be difficult to configure and manage. To address this, we are exploring options for improving our configuration tools for Lustre, including replacing lustre_config completely. However, we want to engage our user community as much as possible in this effort, since that helps ensure the tools we end up with will be a good fit for those who use them the most.

For instance, the new tool "Shine" (CEA's rewrite of Bull's admin tools) looks very promising and I encourage everyone interested in this to check it out: http://shine-wiki.async.eu/wiki/Home. Does anyone know of other tools, or of efforts underway to build one? Ideally, I'd like to concentrate efforts on one solution and declare this to be the "official" tool.

On a related note, what do people use to monitor Lustre? Of course there is LMT (http://sourceforge.net/projects/lmt/), and I've heard reports of others using nagios and/or ganglia as well. Anything else? Should we start documenting these on a "best practices" wiki page?

cheers,
robert
Robert,

Here is some feedback from our perspective. Our deployments are probably not big enough to care too much about configuration tools and we don't use any fancy networking or routers. That said, I'm all for nice ways to monitor Lustre-specific metrics, and I suppose the config tools also serve to define your configuration, which the monitoring tools can then use. I'd also vote for a monitoring webpage over a native GUI (if that is up for discussion). I remember the old Lustre Manager webpage and wish it had been continued!

We currently monitor our Lustre server metrics using an in-house daemon (similar to Ganglia) whose output is then plotted using Cacti. It doesn't give us much in the way of instantaneous feedback but we tend to be more interested in trends anyway.

One thing that may or may not be useful (!) would be a Lustre yum repository. Then perhaps we can get upgrades and recommended software versions automatically and easily. It would be great for client upgrades (patchless) if we had a Lustre akmod RPM package (or similar) which would automatically build the modules for the current kernel (like the Nvidia modules in rpmfusion). You could even go the whole way and create a Lustre distro based on CentOS and provide a CD for download (e.g. AsteriskNow). I appreciate this may not be of any use at all to the big labs, but if you want wider adoption of Lustre then it might be useful.

> It has come to our attention that Lustre is occasionally found to be
> difficult to configure and manage. To address this, we are exploring
> options for improving our configuration tools for Lustre, including
> replacing lustre_config completely. However, we want to engage our
> user community as much as possible in this effort, since that helps
> ensure the tools we end up with will be a good fit for those who use
> them the most. For instance, the new tool "Shine" (CEA's rewrite of
> Bull's admin tools) looks very promising and I encourage everyone
> interested in this to check it out:
> http://shine-wiki.async.eu/wiki/Home. Does anyone know of other tools
> or efforts underway to build one? Ideally, I'd like to concentrate
> efforts on one solution, and declare this to be the "official" tool.
>
> On a related note, what do people use to monitor Lustre? Of course
> there is LMT (http://sourceforge.net/projects/lmt/), and I've heard
> reports of others using nagios and/or ganglia, as well. Anything else?

We looked at LMT but it was a bit of a pain to configure tbh. It was obviously designed specifically for LLNL's environment and relied on many things that we simply don't use.

Hopefully some of this is useful!

Daire
On Mon, 2009-02-23 at 12:07 +0000, Daire Byrne wrote:
> Robert,

Hi Daire,

> One thing that may or may not be useful (!) would be a Lustre yum
> repository.

I'm afraid our current Lustre distribution policy is distribution through the SDLC only.

> Then perhaps we can get upgrades and recommended software
> versions automatically and easily.

Yeah, catalogued repositories would make maintenance easier, indeed.

> You could even go the whole way and create a
> Lustre distro based on CentOS and provide a CD for download (e.g.
> AsteriskNow).

We do. You are describing the Sun HPC software stack.

b.