Hi SunHPC-ers,

some comments about what happens when I point my current (non-Sun) oneSIS
CentOS 5.3 compute node client image at the SunHPC repos... this is a
presumably slightly unsupported, and yet incredibly useful way of using
the Sun stack... :-)

1) is it correct for me to be pointing at all 4 repos under
   http://dlc.sun.com/linux_hpc/yum/sunhpc/2.0/rhel/ ?

2) there is the usual openib vs. kernel-ib rpm conflict over who owns
   /etc/init.d/openibd - perhaps the Lustre rpm should change the name
   of their version of the init script.

3) it appears that the OFED in the SunHPC stack (1.3.1) is older than
   that which comes with CentOS 5.3 (a bit hard to tell but I think it's
   1.3.2). can that be right? presumably it'll be fixed with the next
   Lustre version.

4) the SunHPC oneSIS rpm silently overwrites my etc/sysimage.conf file
   in the image. Ow. you probably can respin that rpm and specify
   sysimage.conf as a not-to-be-destroyed config file in the spec file...
   this will presumably hit every SunHPC user when they update to the
   latest version of the stack.

5) pretty trivial, but the SunHPC oneSIS rpm drags in the packages
     dhcp syslinux tftp-server
   which are not relevant for a client image. perhaps you could spin a
   oneSIS-client and oneSIS-server (or -image and -server) rpm, and just
   the server rpm has those dependencies.

6) should a recent OpenMPI be in the software stack? or is the Sun
   improved/approved version available from some other Sun repo? I ask
   because the OpenMPI 1.2.7 in standard CentOS is ~unusably old and
   possibly not even IB aware.
   in reality we tend to build and maintain our own un-packaged multitude
   of OpenMPI's, so this is not really important, just a curious omission...

cheers,
robin
--
Dr Robin Humble, HPC Systems Analyst, NCI National Facility
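Editor's sketch for point 4: the spec-file change being asked for is presumably RPM's `%config(noreplace)` marking on sysimage.conf. Until a respun rpm exists, one way to at least notice a silent overwrite is to checksum the file before the update and compare afterwards. The helper names below are hypothetical, and the image path in the usage note is a placeholder, not something taken from the thread.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, or None if it does not exist."""
    p = Path(path)
    if not p.is_file():
        return None
    return hashlib.sha256(p.read_bytes()).hexdigest()


def snapshot(paths):
    """Record a digest for each path before running the package update."""
    return {str(p): file_digest(p) for p in paths}


def changed_since(before):
    """Return the sorted paths whose contents differ from the recorded snapshot."""
    return sorted(p for p, digest in before.items() if file_digest(p) != digest)
```

Usage would be along the lines of `before = snapshot(["<image root>/etc/sysimage.conf"])`, then the package update, then `changed_since(before)`; any path returned was modified (or removed) by the update and should be restored from your own copy.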
On Jul 14, 2009, at 11:21 PM, Robin Humble wrote:

> 6) should a recent OpenMPI be in the software stack? or is the Sun
>    improved/approved version available from some other Sun repo? I ask
>    because the OpenMPI 1.2.7 in standard CentOS is ~unusably old and
>    possibly not even IB aware.

It looks as though ClusterTools 8.1 is there:

http://dlc.sun.com/linux_hpc/yum/sunhpc/2.0/rhel/base/x86_64/SunHPC/

That is based on OpenMPI pre-1.3 (it doesn't correspond exactly to any
OpenMPI release).

Iain
Hi Robin.

On Tue, 2009-07-14 at 23:21 -0400, Robin Humble wrote:

> some comments about what happens when I point my current (non-Sun) oneSIS
> CentOS 5.3 compute node client image at the SunHPC repos... this is a
> presumably slightly unsupported, and yet incredibly useful way of using
> the Sun stack... :-)

We're quite interested in hearing any and all reports on stack usage
from folks out there in the real world. :) I wouldn't say that using
the online repos from within the OneSIS image is unsupported... Just not
documented. We've really been trying to walk a line between just
providing a pile of the best HPC tools we know of, and providing a fancy
integrated pointy-clicky suite of software on rails. Put another way, it
is our hope and expectation that "power users" will be able to take what
we've provided and find new and interesting ways to play with it. This
is precisely what the team had in mind when we started thinking about
making a distribution of all the best Open Source HPC tools we could
find.

> 1) is it correct for me to be pointing at all 4 repos under
>    http://dlc.sun.com/linux_hpc/yum/sunhpc/2.0/rhel/ ?

Of the four repos there, only base/ and updates/ should definitely be
enabled. The sunhpc/ repo is only useful if you want a perfctr-patched
kernel and its collateral (lustre-client, kernel-ib, etc.). The lustre/
repo is ONLY for Lustre servers (e.g. MDS/OSS), which require a
Lustre-patched kernel. For a normal compute image, it should not be
enabled.

> 2) there is the usual openib vs. kernel-ib rpm conflict over who owns
>    /etc/init.d/openibd - perhaps the Lustre rpm should change the name
>    of their version of the init script.

Technically, the kernel-ib RPM comes directly from the OFED ofa-kernel
SRPM, not Lustre. I haven't really looked into this personally, but I
suspect that the conflict is due to some vendor massaging of the OFED
suite.
We just build OFED directly from the upstream OFA tarball, rather than
attempting to massage things to fit into the rhel/sles layouts for OFED.
One thing that our installer could probably do better is to try harder
to remove vendor OFED components before proceeding with installation of
the sunhpc OFED stuff. I believe that the next release of the stack's
installer has actually addressed the very issue you've raised, but I'll
have to check to be sure.

> 3) it appears that the OFED in the SunHPC stack (1.3.1) is older than
>    that which comes with CentOS 5.3 (a bit hard to tell but I think it's
>    1.3.2). can that be right? presumably it'll be fixed with the next
>    Lustre version.

The next release of our stack (2.0.1) will essentially be 2.0 + OFED
1.4.1. There will be some minor updates of other components, but overall
functionality should be identical. Unfortunately, OFED 1.4.1 stabilized
too late in our release cycle to be included in 2.0.

> 4) the SunHPC oneSIS rpm silently overwrites my etc/sysimage.conf file
>    in the image. Ow. you probably can respin that rpm and specify
>    sysimage.conf as a not-to-be-destroyed config file in the spec file...
>    this will presumably hit every SunHPC user when they update to the
>    latest version of the stack.

Hmm. This seems annoying. I've created bug 20176 for this issue. Thanks
for reporting it.

> 5) pretty trivial, but the SunHPC oneSIS rpm drags in the packages
>      dhcp syslinux tftp-server
>    which are not relevant for a client image. perhaps you could spin a
>    oneSIS-client and oneSIS-server (or -image and -server) rpm, and just
>    the server rpm has those dependencies.

Yeah. We've been thinking a lot about overhauling the whole provisioning
setup to allow for more lightweight clients and better customization.
There is a lot of unnecessary stuff that winds up being dragged in for
compute clients (and Lustre servers, for that matter).
I can't commit to any specific plans at the moment, but it's definitely
on our roadmap.

> 6) should a recent OpenMPI be in the software stack? or is the Sun
>    improved/approved version available from some other Sun repo? I ask
>    because the OpenMPI 1.2.7 in standard CentOS is ~unusably old and
>    possibly not even IB aware.
>    in reality we tend to build and maintain our own un-packaged multitude
>    of OpenMPI's, so this is not really important, just a curious omission...

As has been pointed out in another reply to your post, we've replaced
stock OpenMPI with Sun's ClusterTools distribution. CT is OpenMPI with
additional integration hooks and other goodies. It's completely
unrestricted and free (BSD license). Our 2.0 release includes CT 8.1,
and our next release will include CT 8.2.

http://www.sun.com/software/products/clustertools/index.xml

Best,
Mike
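Editor's sketch of the repo advice in the answer to point 1: a small generator for a yum .repo file in which base and updates are always enabled, sunhpc is enabled only when the perfctr-patched kernel is wanted, and lustre only on Lustre servers. The repo ids, the `name` labels and the exact `baseurl` layout below each repo directory are assumptions for illustration; check where the repodata actually lives on the mirror before using anything like this.

```python
import configparser
import io

# Top-level URL quoted in the thread; the four repo directory names come from the reply.
BASE_URL = "http://dlc.sun.com/linux_hpc/yum/sunhpc/2.0/rhel"


def build_repo_file(want_perfctr_kernel=False, is_lustre_server=False):
    """Return .repo file text with each repo enabled or disabled per the advice above."""
    enabled = {
        "base": True,                   # should definitely be enabled
        "updates": True,                # should definitely be enabled
        "sunhpc": want_perfctr_kernel,  # perfctr-patched kernel and its collateral
        "lustre": is_lustre_server,     # only for Lustre servers (MDS/OSS)
    }
    parser = configparser.ConfigParser()
    for name, on in enabled.items():
        section = "sunhpc-2.0-" + name               # hypothetical repo id
        parser[section] = {
            "name": "SunHPC 2.0 " + name,            # hypothetical label
            "baseurl": BASE_URL + "/" + name + "/",  # assumed layout, not verified
            "enabled": "1" if on else "0",
        }
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()
```

For a normal compute image, `build_repo_file()` leaves sunhpc and lustre disabled; the text it returns could be written to a file under the image's etc/yum.repos.d/ directory.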
I'll field the simple one:

On Jul 14, 2009, at 8:21 PM, Robin Humble wrote:

> [...]
> 6) should a recent OpenMPI be in the software stack? or is the Sun
>    improved/approved version available from some other Sun repo? I ask
>    because the OpenMPI 1.2.7 in standard CentOS is ~unusably old and
>    possibly not even IB aware.
>    in reality we tend to build and maintain our own un-packaged multitude
>    of OpenMPI's, so this is not really important, just a curious
>    omission...

Yes it is there as Sun Cluster Tools.
-rw-rw-r-- 1 root root 8815453 Apr 28 11:27 clustertools_gcc-8.1-sunhpc8.x86_64.rpm
-rw-rw-r-- 1 root root 9316660 Apr 24 14:53 clustertools_intel-8.1-sunhpc7.x86_64.rpm
-rw-rw-r-- 1 root root 4882398 Apr 24 15:13 clustertools_pathscale-8.1-sunhpc7.x86_64.rpm
-rw-rw-r-- 1 root root 6685784 Apr 24 14:28 clustertools_pgi-8.1-sunhpc7.x86_64.rpm
-rw-rw-r-- 1 root root 4590613 Apr 28 13:23 clustertools_sunstudio-8.1-sunhpc8.x86_64.rpm

You can also get the just-released 8.2 here:

http://www.sun.com/software/products/clustertools/get_it.jsp

-frank

> cheers,
> robin
> --
> Dr Robin Humble, HPC Systems Analyst, NCI National Facility
>
> _______________________________________________
> Linux_hpc_swstack mailing list
> Linux_hpc_swstack at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/linux_hpc_swstack