Hey Guys,

I have been having a lot of trouble with the SLES 11 kernel that Lustre 1.8.4 supports. I tried downgrading the kernel and the Lustre client (1.8.1.1), but the modules provided by the kernel-ib package had some problems with my HCA. The newer kernel-ib package that comes with Lustre client 1.8.4 did not have any issues.

Would it be possible for me to build and install my own OFED package and install just the Lustre client RPMs? Do you see any issues with this setup? Or should Lustre be built from scratch altogether?

Thanks,
-J
On Oct 27, 2010, at 1:37 PM, Jagga Soorma wrote:

> I have been having a lot of trouble with the SLES 11 kernel that Lustre
> 1.8.4 supports. I tried downgrading the kernel and the Lustre client
> (1.8.1.1), but the modules provided by the kernel-ib package had some
> problems with my HCA. The newer kernel-ib package that comes with Lustre
> client 1.8.4 did not have any issues. Would it be possible for me to build
> and install my own OFED package and install just the Lustre client RPMs?
> Do you see any issues with this setup? Or should Lustre be built from
> scratch altogether?

Jagga,

We use our own OFED RPMs with Lustre clients using the Lustre client RPMs. We also have some clients compiled from source because we can't run stock kernels on some of our hardware.

Both work fine.

-mb

--
+-----------------------------------------------
| Michael Barnes
|
| Thomas Jefferson National Accelerator Facility
| Scientific Computing Group
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------
Thanks Michael for your response. So if I understand correctly, you have not had any issues running the stock kernel with the Sun/Oracle-provided Lustre client RPMs, and instead of using the kernel-ib package you install your own OFED packages.

Also, I have the new Intel 8-core CPUs and would prefer to go to SLES 11 SP1 instead of SLES 11. However, this is not supported by the Lustre client yet. What has your experience been with building your own Lustre RPMs from source using a different kernel? Do you still have to patch the kernel? I am also thinking about installing SLES 11 SP1 and just building the Lustre client RPMs from source. I am not sure whether it is required to patch the kernel if I use the most up-to-date version provided by SLES 11 SP1.

Thanks again,
-J
On Oct 27, 2010, at 1:56 PM, Jagga Soorma wrote:

> Thanks Michael for your response. So if I understand correctly, you have
> not had any issues running the stock kernel with the Sun/Oracle-provided
> Lustre client RPMs, and instead of using the kernel-ib package you install
> your own OFED packages.

That's correct.

> Also, I have the new Intel 8-core CPUs and would prefer to go to SLES 11
> SP1 instead of SLES 11. However, this is not supported by the Lustre
> client yet. What has your experience been with building your own Lustre
> RPMs from source using a different kernel? Do you still have to patch the
> kernel? I am also thinking about installing SLES 11 SP1 and just building
> the Lustre client RPMs from source. I am not sure whether it is required
> to patch the kernel if I use the most up-to-date version provided by
> SLES 11 SP1.

No. Lustre client kernel modules are self-contained, a.k.a. "patchless" clients. It's been a while since I made the RPMs, but I found this lying around:

    ./configure --disable-server --with-linux=/usr/src/linux-2.6.22-pfm-xeon --with-o2ib --enable-quota --disable-readline

Then I believe 'make rpms' does the right thing.

Now that I said how easy it was, there is a caveat: there may be issues with specific kernels, but this worked for us. The linux-2.6.22 kernel is a kernel.org kernel with pfm (performance monitoring) patches, and it also carries an NDAed patch from AMD that works around bugs in the CPUs.

It works for us, YMMV.

-mb
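For a different kernel and a separately built OFED, the same recipe might be adapted roughly as in the sketch below. It only prints the configure line so it can be reviewed before anything is run. The function name and the example paths (/usr/src/linux, /usr/src/ofa_kernel) are assumptions for illustration and are not taken from this thread; the flags are the ones from the command above, with --with-o2ib given an explicit path when an OFED source tree is supplied.

```shell
#!/bin/sh
# Sketch only: prints a configure line for a client-only (patchless) build.
# lustre_client_configure is a hypothetical helper, not part of Lustre.
#   $1 = kernel source tree to build against
#   $2 = OFED kernel source tree (optional; assumed layout)
lustre_client_configure() (
    kernel_src=$1
    ofed_src=${2:-}
    o2ib_flag="--with-o2ib"
    # With a self-built OFED, point the o2ib LND at that tree
    if [ -n "$ofed_src" ]; then
        o2ib_flag="--with-o2ib=$ofed_src"
    fi
    echo "./configure --disable-server --with-linux=$kernel_src $o2ib_flag --enable-quota"
)

# Example dry run (paths are assumptions, adjust to the local system):
#   lustre_client_configure /usr/src/linux /usr/src/ofa_kernel
#   followed by: make rpms
```

Printing the command instead of executing it keeps the sketch harmless on a machine that has no Lustre source tree unpacked.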
Michael,

Which source should I be downloading from Oracle's site? There seem to be different client source RPMs based on the distribution. I would have expected just a single source tarball or src.rpm, but that does not seem to be the case.

My apologies for the n00b question, but I have not built the Lustre client from source before.

Thanks,
-J
Okay, I think I need lustre-1.8.4.tar.gz. I will try building the client with it and install my own OFED package. Hope this works.

Thanks,
-J
Just out of curiosity, how come you are using the --disable-readline option?

Thanks,
-J
On Oct 27, 2010, at 2:46 PM, Jagga Soorma wrote:

> Just out of curiosity, how come you are using the --disable-readline option?

Too lazy to install the readline source package, and I don't think I needed it. No other reason.

-mb
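The choice could also be made automatically: pass --disable-readline only when the readline development headers are missing. The sketch below is one way to do that. The helper name and the header location (readline/readline.h under /usr/include) are assumptions about a typical Linux layout, not something stated in this thread.

```shell
#!/bin/sh
# Sketch only: print --disable-readline when readline headers are absent.
# readline_flag is a hypothetical helper.
#   $1 = include prefix to check (defaults to /usr/include)
readline_flag() (
    inc=${1:-/usr/include}
    # Print nothing when the header exists, so configure keeps its default
    if [ ! -f "$inc/readline/readline.h" ]; then
        echo "--disable-readline"
    fi
)

# Example: ./configure --disable-server ... $(readline_flag)
```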
Hey Michael,

The configure process went fine. However, after doing a make install and rebooting the server, I am not able to load the lustre module even though it does exist:

--
node205:~ # modprobe lustre
FATAL: Module lustre not found.
node205:~ # cd /lib
node205:/lib # find . -name "lustre.ko"
./modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko
node205:/lib # uname -a
Linux node205 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux
--

I am probably missing something here. Any help would be appreciated.

Thanks,
-J
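modprobe resolves module names through the modules.dep index under /lib/modules/<kernel version>, which depmod generates, so a .ko file that is present on disk but not yet indexed produces exactly this "not found" message. A minimal sketch of a check follows; the helper name is hypothetical, and it assumes the usual modules.dep layout of one "path/to/module.ko: dependencies" entry per line.

```shell
#!/bin/sh
# Sketch only: succeed if module $1 has its own entry in modules.dep.
# module_indexed is a hypothetical helper.
#   $1 = module name without the .ko suffix
#   $2 = module directory (defaults to /lib/modules/<running kernel>)
module_indexed() (
    mod=$1
    moddir=${2:-/lib/modules/$(uname -r)}
    # Match only the entry before the colon, not dependency lists after it
    grep -Eq "^([^: ]*/)?${mod}\.ko[^: ]*:" "$moddir/modules.dep" 2>/dev/null
)

# Example: rebuild the index only if lustre is missing from it
#   module_indexed lustre || depmod -a
```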
Okay, so I ran a depmod and then tried again. I am now running into these error messages:

--
node205:/etc/modprobe.d # modprobe lustre
WARNING: Error inserting osc (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/osc.ko): Input/output error
WARNING: Error inserting mdc (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/mdc.ko): Input/output error
WARNING: Error inserting lov (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lov.ko): Input/output error
FATAL: Error inserting lustre (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko): Input/output error
--

What am I missing here?

Thanks,
-J
> Okay, so I ran a depmod and then tried again. I am now running into these
> error messages:
> --
> node205:/etc/modprobe.d # modprobe lustre
> WARNING: Error inserting osc
> (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/osc.ko):
> Input/output error

You should look in /var/log/messages or in dmesg; that should show you the "real" error. I suspect there will be a lot of them, so you probably want to go back and find the first error message, which will likely be the most useful one.

If I had to take a guess with my crystal ball ... I think your problem might be symbol version mismatches between the version of OFED that Lustre was compiled against and the version of OFED that is installed.

--Ken
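Picking out the first relevant message could be scripted along these lines. This is a sketch under the assumption that a symbol mismatch shows up in the kernel log with the usual "Unknown symbol" or "disagrees about version of symbol" wording; the helper name is hypothetical, and other failure causes would need different patterns.

```shell
#!/bin/sh
# Sketch only: read kernel log text on stdin and print the first line
# that looks like a module symbol problem.
# first_symbol_error is a hypothetical helper.
first_symbol_error() {
    grep -E 'Unknown symbol|disagrees about version of symbol' | head -n 1
}

# Example:
#   dmesg | first_symbol_error
#   first_symbol_error < /var/log/messages
```

If a line is printed, the module and symbol it names indicate which side of the OFED/Lustre pairing would need to be rebuilt.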