Carlos Santana
2009-Jun-15 21:07 UTC
[Lustre-discuss] Lustre installation and configuration problems
Hello list,

I am struggling to install Lustre 1.8 on a CentOS 5.2 box. I am referring to the Lustre manual (http://manual.lustre.org/index.php?title=Main_Page) and the Lustre HowTo guide (http://wiki.lustre.org/index.php/Lustre_Howto). Following is the installation order, with the warning/error messages (if any) associated with each step:

 - kernel-lustre patch
 - lustre-modules: http://www.heypasteit.com/clip/8UJ
 - lustre-ldiskfs: http://www.heypasteit.com/clip/8UK
 - lustre-utilities
 - e2fsprogs: http://www.heypasteit.com/clip/8UL

I did not see any test examples under the /usr/lib/lustre/examples directory mentioned in the HowTo document. In fact, I do not have an 'examples' dir at all. So I skipped to the http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools section, but I did not have the lmc, lconf, and lctl commands either.

Any clues on how I should proceed with installation and configuration? Is there a step-by-step installation guide? Feedback/comments welcome.

Thanks,
CS.
Arden Wiebe
2009-Jun-15 22:16 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos:

I'm not clear on which kernel package you tried to install. From my understanding of the wording in the manual, there is pretty much a set order for installing the packages. From experience:

    rpm -ivh kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0.x86_64.rpm
    rpm -ivh lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
    rpm -ivh lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
    rpm -ivh lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
    rpm -Uvh e2fsprogs-1.40.11.sun1-0redhat.rhel5.x86_64.rpm

Hope that helps, as that order has worked for me many times.

Arden
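The order reported above can be written down as a small dry-run script. This is only a sketch: the function name print_install_plan is made up for illustration, it prints the rpm commands rather than running them, and the package file names are the x86_64 ones quoted in the message, which would need to be swapped for whatever matches your own kernel and architecture.

```shell
# Hypothetical helper: print (do not run) the rpm commands in the
# order that was reported to work. Nothing is installed by this function.
print_install_plan() {
    # Package names exactly as quoted in the message (x86_64, Lustre 1.8.0).
    for pkg in \
        kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0.x86_64.rpm \
        lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm \
        lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm \
        lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp.x86_64.rpm
    do
        echo "rpm -ivh $pkg"
    done
    # The message uses -U (upgrade) for e2fsprogs rather than -i.
    echo "rpm -Uvh e2fsprogs-1.40.11.sun1-0redhat.rhel5.x86_64.rpm"
}
```

Reviewing the printed plan before piping it to a shell makes it easy to confirm that every file name matches the kernel you intend to boot.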
Carlos Santana
2009-Jun-16 15:52 UTC
[Lustre-discuss] Lustre installation and configuration problems
Thanks for the update Sheila. I am using the manual for Lustre 1.8 (May-09).

Arden, as per the 1.8 manual:

--- ---
Install the kernel, modules and ldiskfs packages.
Use the rpm -ivh command to install the kernel, module and ldiskfs packages. For example:
$ rpm -ivh kernel-lustre-smp-<ver> \
kernel-ib-<ver> \
lustre-modules-<ver> \
lustre-ldiskfs-<ver>
c. Install the utilities/userspace packages.
Use the rpm -ivh command to install the utilities packages. For example:
$ rpm -ivh lustre-<ver>
d. Install the e2fsprogs package.
Use the rpm -i command to install the e2fsprogs package. For example:
$ rpm -i e2fsprogs-<ver>
If you want to add any optional packages to your Lustre file system, install them now.
4. Verify that the boot loader (grub.conf or lilo.conf) has
--- ---

I followed the same order.

lconf and lmc are not available on my system. I am not sure what they are or when I will need them. I continued to explore other things in Lustre and have created MDS and OST mount points on the same system. I have installed the Lustre client on a separate machine, and when I tried to mount the file system from the MGS on it, I received the following error:

--- ---
[root@localhost ~]# mount -t lustre 10.0.0.42@tcp0:/lustre /mnt/lustre
mount.lustre: mount 10.0.0.42@tcp0:/lustre at /mnt/lustre failed: No such device
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
Note 'alias lustre llite' should be removed from modprobe.conf
--- ---

modprobe on the client says 'module lustre not found'. Any clues?

Client: Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47 EDT 2008 i686 i686 i386 GNU/Linux
MDS/OST: Linux localhost.localdomain 2.6.18-92.1.17.el5_lustre.1.8.0smp #1 SMP Wed Feb 18 18:40:54 MST 2009 i686 i686 i386 GNU/Linux

Thanks,
CS.
Kevin Van Maren
2009-Jun-16 16:36 UTC
[Lustre-discuss] Lustre installation and configuration problems
I think lconf and lmc went away with Lustre 1.6. Are you sure you are looking at the 1.8 manual, and not directions for 1.4?

/usr/sbin/lctl should be in the lustre-<version> RPM. Do a:
    # rpm -q -l lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp

Do make sure the modules are installed in the right place:
    # cd /lib/modules/`uname -r`
    # find . | grep lustre.ko

If it shows up, then do:
    # lustre_rmmod
    # depmod
and try again.

Otherwise, figure out where your modules are installed:
    # uname -r
    # cd /lib/modules
    # find . | grep lustre.ko

You can also double-check the NID. On the MDS server, do
    # lctl list_nids

It should show 10.0.0.42@tcp0

Kevin
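The module check described above can be wrapped in two small functions. This is a sketch under assumptions: the function names are invented for illustration, and the module tree is passed as a parameter (defaulting to /lib/modules) only so the logic can be tried against a scratch directory.

```shell
# Hypothetical helper: look for lustre.ko under the module tree of one
# kernel release. MODROOT defaults to /lib/modules, KREL to `uname -r`.
find_lustre_module() {
    modroot="${1:-/lib/modules}"
    krel="${2:-$(uname -r)}"
    find "$modroot/$krel" -name 'lustre.ko' 2>/dev/null
}

# Hypothetical helper: list every kernel release under MODROOT that
# contains lustre.ko, to spot modules installed for another kernel.
kernels_with_lustre() {
    modroot="${1:-/lib/modules}"
    find "$modroot" -name 'lustre.ko' 2>/dev/null |
        sed "s|^$modroot/||" | cut -d/ -f1 | sort -u
}
```

If find_lustre_module prints nothing while kernels_with_lustre lists a different release, the modules were installed for a kernel other than the one currently running, and modprobe will not find them.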
Carlos Santana
2009-Jun-16 16:58 UTC
[Lustre-discuss] Lustre installation and configuration problems
Thanks Kevin.

I am referring to the 1.8 manual, but I was also referring to the HowTo page on the wiki, which seems to be for 1.6. The HowTo page http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools mentions lmc, lconf, and lctl.

The modules are installed in the right place. Running 'lustre_rmmod' resulted in the following output:

[root@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
ERROR: Module obdfilter is in use
ERROR: Module ost is in use
ERROR: Module mds is in use
ERROR: Module fsfilt_ldiskfs is in use
ERROR: Module mgs is in use
ERROR: Module mgc is in use by mgs
ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
ERROR: Module lov is in use
ERROR: Module lquota is in use by obdfilter,mds
ERROR: Module osc is in use
ERROR: Module ksocklnd is in use
ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
ERROR: Module obdclass is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
ERROR: Module lvfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
ERROR: Module libcfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs

Do I need to shut down these services? How can I do that?

Thanks,
CS.
Cliff White
2009-Jun-16 17:16 UTC
[Lustre-discuss] Lustre installation and configuration problems
Please read:
http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529

Those instructions are identical for 1.6 and 1.8.

For current Lustre, only two commands are used for configuration: mkfs.lustre and mount.

Usually when lustre_rmmod returns that error, you run it a second time and it will clear things, unless you have live mounts or network connections.

cliffw
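To make the "two commands" point concrete, here is a dry-run sketch that only prints the mkfs.lustre and mount invocations for a minimal setup with a combined MGS/MDT and one OST. The function name, device names, and mount points are hypothetical placeholders, and the option names (--fsname, --mgs, --mdt, --ost, --mgsnode) reflect my understanding of mkfs.lustre in the 1.6/1.8 series, so check them against the manual page linked above before running anything.

```shell
# Hypothetical dry run: print the commands for a minimal setup.
# Nothing is formatted or mounted; /mnt/mdt, /mnt/ost0 and /mnt/lustre
# are placeholder mount points chosen for illustration.
print_lustre_setup() {
    fsname="$1"   # file system name, e.g. lustre
    mgsnid="$2"   # NID of the MGS, e.g. 10.0.0.42@tcp0
    mdtdev="$3"   # block device for the combined MGS/MDT (placeholder)
    ostdev="$4"   # block device for the OST (placeholder)

    echo "mkfs.lustre --fsname=${fsname} --mgs --mdt ${mdtdev}"
    echo "mount -t lustre ${mdtdev} /mnt/mdt"
    echo "mkfs.lustre --fsname=${fsname} --ost --mgsnode=${mgsnid} ${ostdev}"
    echo "mount -t lustre ${ostdev} /mnt/ost0"
    # On a client, the mount source is <MGS NID>:/<fsname>.
    echo "mount -t lustre ${mgsnid}:/${fsname} /mnt/lustre"
}
```

For example, print_lustre_setup lustre 10.0.0.42@tcp0 /dev/sdb /dev/sdc ends with the same client mount command that appears in this thread.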
Ms. Megan Larko
2009-Jun-16 17:32 UTC
[Lustre-discuss] Lustre installation and configuration problems
Hi!

I concur with Cliff White. "lustre_rmmod" returns those sorts of errors if there is still a Lustre disk mounted. I have found (with Lustre version 1.6.7.1) that I have to unmount all Lustre disks first (hopefully nicely), and then I can typically run the lustre_rmmod command without errors.

My problems have come from a hung process, which cannot be killed, accessing a Lustre disk. I have to cycle power to the Lustre client on which the process is hung, because the Linux shutdown command hangs on trying to remove the Lustre modules. I can't remove the modules while the disk is still mounted (perceived active), and kill -9 PID isn't working. My personal practice is to run the Linux shutdown as far as it goes (the lustre_rmmod part) and then physically cycle power on the stuck client box.

If there is a better way, I would like to learn it.

Cheers!
megan
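The "unmount everything first" step can be sketched as a helper that lists mount points of type lustre. The function name is made up for illustration, and the sketch assumes mount points contain no whitespace (the kernel escapes spaces in /proc/mounts, which this does not decode).

```shell
# Hypothetical helper: print the mount points of type "lustre" from a
# mounts table (default /proc/mounts). In that table, field 2 is the
# mount point and field 3 is the file system type.
lustre_mount_points() {
    awk '$3 == "lustre" { print $2 }' "${1:-/proc/mounts}"
}
```

A possible use, as root and only once nothing is still using the file system:

    for m in $(lustre_mount_points); do umount "$m"; done
    lustre_rmmod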
Carlos Santana
2009-Jun-16 18:21 UTC
[Lustre-discuss] Lustre installation and configuration problems
I was able to run lustre_rmmod and depmod successfully. 'lctl list_nids' returned the server IP address and interface (tcp0).

I tried to mount the file system on a remote client, but it failed with the following message:

--- ---
[root@localhost ~]# mount -t lustre 10.0.0.42@tcp0:/lustre /mnt/lustre
mount.lustre: mount 10.0.0.42@tcp0:/lustre at /mnt/lustre failed: No such device
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
Note 'alias lustre llite' should be removed from modprobe.conf
--- ---

However, mounting is successful in a single-node configuration, with the client on the same machine as the MDS and OST. Any clues? Where should I look for logs and debug messages?

Thanks,
CS.
Cliff White
2009-Jun-16 18:32 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos Santana wrote:
> I was able to run lustre_rmmod and depmod successfully. 'lctl list_nids'
> returned the server IP address and interface (tcp0).
>
> I tried to mount the file system on a remote client, but it failed with
> the following message.
> --- ---
> [root@localhost ~]# mount -t lustre 10.0.0.42@tcp0:/lustre /mnt/lustre
> mount.lustre: mount 10.0.0.42@tcp0:/lustre at /mnt/lustre failed: No such device
> Are the lustre modules loaded?
> Check /etc/modprobe.conf and /proc/filesystems
> Note 'alias lustre llite' should be removed from modprobe.conf
> --- ---
>
> However, the mounting is successful on a single node configuration,
> with the client on the same machine as the MDS and OST.
> Any clues? Where to look for logs and debug messages?

Syslog || /var/log/messages is the normal place.

You can use 'lctl ping' to verify that the client can reach the server. Usually in these cases, it's a network/name misconfiguration.

Run 'tunefs.lustre --print' on your servers, and verify that mgsnode is correct.

cliffw

> Thanks,
> CS.
>
> On Tue, Jun 16, 2009 at 12:16 PM, Cliff White <Cliff.White at sun.com> wrote:
>> Carlos Santana wrote:
>>> Thanks Kevin..
>>
>> Please read:
>> http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
>>
>> Those instructions are identical for 1.6 and 1.8.
>>
>> For current lustre, only two commands are used for configuration:
>> mkfs.lustre and mount.
>>
>> Usually when lustre_rmmod returns that error, you run it a second
>> time, and it will clear things. Unless you have live mounts or
>> network connections.
>>
>> cliffw
>>
>>> I am referring to the 1.8 manual, but I was also referring to the HowTo
>>> page on the wiki, which seems to be for 1.6. The HowTo page
>>> http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
>>> mentions lmc, lconf, and lctl.
>>>
>>> The modules are installed in the right place. 'lustre_rmmod'
>>> resulted in the following output:
>>> [root@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
>>> ERROR: Module obdfilter is in use
>>> ERROR: Module ost is in use
>>> ERROR: Module mds is in use
>>> ERROR: Module fsfilt_ldiskfs is in use
>>> ERROR: Module mgs is in use
>>> ERROR: Module mgc is in use by mgs
>>> ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
>>> ERROR: Module lov is in use
>>> ERROR: Module lquota is in use by obdfilter,mds
>>> ERROR: Module osc is in use
>>> ERROR: Module ksocklnd is in use
>>> ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
>>> ERROR: Module obdclass is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
>>> ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
>>> ERROR: Module lvfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
>>> ERROR: Module libcfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
>>>
>>> Do I need to shut down these services? How can I do that?
>>>
>>> Thanks,
>>> CS.
>>>
>>> On Tue, Jun 16, 2009 at 11:36 AM, Kevin Van Maren <Kevin.Vanmaren at sun.com> wrote:
>>>> I think lconf and lmc went away with Lustre 1.6. Are you sure you
>>>> are looking at the 1.8 manual, and not directions for 1.4?
>>>>
>>>> /usr/sbin/lctl should be in the lustre-<version> RPM. Do a:
>>>> # rpm -q -l lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>>>>
>>>> Do make sure the modules are installed in the right place:
>>>> # cd /lib/modules/`uname -r`
>>>> # find . | grep lustre.ko
>>>>
>>>> If it shows up, then do:
>>>> # lustre_rmmod
>>>> # depmod
>>>> and try again.
>>>>
>>>> Otherwise, figure out where your modules are installed:
>>>> # uname -r
>>>> # cd /lib/modules
>>>> # find . | grep lustre.ko
>>>>
>>>> You can also double-check the NID. On the MDS server, do
>>>> # lctl list_nids
>>>>
>>>> Should show 10.0.0.42@tcp0
>>>>
>>>> Kevin
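The mount.lustre error quoted above points at /proc/filesystems. A minimal sketch of that check, assuming a POSIX shell; the function name fs_registered and its optional second argument are made up for illustration and are not part of any Lustre tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a filesystem type is registered with the kernel.
#   $1 = filesystem type (for example: lustre)
#   $2 = file to read (defaults to /proc/filesystems; overridable for testing)
fs_registered() {
    fstype=$1
    list=${2:-/proc/filesystems}
    # Each line is "[nodev]<TAB>name"; compare the last field exactly.
    awk -v t="$fstype" '$NF == t { found = 1 } END { exit !found }' "$list"
}

# Report whether the lustre filesystem type is available on this machine.
if fs_registered lustre; then
    echo "lustre is registered; the client modules are loaded"
else
    echo "lustre is not registered; load the modules before mounting"
fi
```

If the second branch is taken on the client, the "No such device" from mount is most likely a module problem rather than a network one, which is worth ruling out before looking at NIDs.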
Carlos Santana
2009-Jun-16 19:09 UTC
[Lustre-discuss] Lustre installation and configuration problems
'lctl ping' and 'lctl network up' failed with the following messages:
--- ---
[root@localhost ~]# lctl ping 10.0.0.42
opening /dev/lnet failed: No such device
hint: the kernel modules may not be loaded
failed to ping 10.0.0.42@tcp: No such device

[root@localhost ~]# lctl network up
opening /dev/lnet failed: No such device
hint: the kernel modules may not be loaded
LNET configure error 19: No such device
--- ---

I tried the lustre_rmmod and depmod commands, and neither returned any error messages. Any further clues? Should I reinstall the patchless client?

-
CS.
Cliff White
2009-Jun-16 19:28 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos Santana wrote:
> 'lctl ping' and 'lctl network up' failed with the following messages:
> --- ---
> [root@localhost ~]# lctl ping 10.0.0.42
> opening /dev/lnet failed: No such device
> hint: the kernel modules may not be loaded
> failed to ping 10.0.0.42@tcp: No such device
>
> [root@localhost ~]# lctl network up
> opening /dev/lnet failed: No such device
> hint: the kernel modules may not be loaded
> LNET configure error 19: No such device

Make sure modules are unloaded, then try modprobe -v. It looks like you have lnet misconfigured; if your module options are wrong, you will see an error during the modprobe.

cliffw

> --- ---
>
> I tried lustre_rmmod and depmod commands and it did not return any error
> messages. Any further clues? Reinstall patchless client again?
>
> -
> CS.
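Before retrying modprobe -v, it can help to look at the module configuration file for the two items this thread touches on: the stale 'alias lustre llite' line named in the mount.lustre error, and the 'options lnet' line that carries the LNET network list. The following is only a sketch under assumptions: check_modprobe_conf is a made-up name, and eth0 in the comment is an assumed interface name, not something taken from the thread.

```shell
#!/bin/sh
# Hypothetical helper: scan a modprobe configuration file.
#   $1 = file to scan (defaults to /etc/modprobe.conf)
# Returns non-zero if the stale alias is present.
check_modprobe_conf() {
    conf=${1:-/etc/modprobe.conf}
    rc=0
    # The mount.lustre error text says this alias should be removed.
    if grep -Eq '^[[:space:]]*alias[[:space:]]+lustre[[:space:]]+llite' "$conf" 2>/dev/null; then
        echo "stale: alias lustre llite"
        rc=1
    fi
    # LNET takes its network list from an "options lnet" line, for example:
    #   options lnet networks=tcp0(eth0)      (eth0 is an assumed interface name)
    if ! grep -Eq '^[[:space:]]*options[[:space:]]+lnet[[:space:]]' "$conf" 2>/dev/null; then
        echo "note: no 'options lnet' line; LNET falls back to its defaults"
    fi
    return $rc
}
```

A possible use, run as root on the client: check_modprobe_conf && modprobe -v lustre. The helper only reads the file; it does not change any configuration.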
Carlos Santana
2009-Jun-16 20:09 UTC
[Lustre-discuss] Lustre installation and configuration problems
'modprobe -l lustre*' did not show any modules on the patchless client, and modprobe -v returns 'FATAL: Module lustre not found'.

How do I install a patchless client? I have tried installing the lustre-client-modules and lustre-client-ver RPM packages in both orders. Am I missing anything?

Thanks,
CS.
Cliff White
2009-Jun-16 21:54 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos Santana wrote:
> 'modprobe -l lustre*' did not show any module on a patchless
> client. modprobe -v returns 'FATAL: Module lustre not found'.
>
> How do I install a patchless client?
> I have tried lustre-client-modules and lustre-client-ver rpm packages in
> both sequences. Am I missing anything?

Make sure the lustre-client-modules package matches your running kernel. Run depmod -a to be sure.

cliffw

> Thanks,
> CS.
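One way to act on this advice, in the spirit of the find commands given earlier in the thread, is to list every kernel directory under /lib/modules that actually holds a lustre.ko and compare that list with uname -r. This is a rough sketch assuming a POSIX shell; kernels_with_lustre is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print each kernel release directory that contains lustre.ko.
#   $1 = modules root (defaults to /lib/modules; overridable for testing)
kernels_with_lustre() {
    root=${1:-/lib/modules}
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        # A non-empty find result means lustre.ko exists somewhere below $dir.
        if [ -n "$(find "$dir" -name 'lustre.ko' 2>/dev/null)" ]; then
            basename "$dir"
        fi
    done
}

# Compare against the running kernel.
running=$(uname -r)
if kernels_with_lustre | grep -Fxq "$running"; then
    echo "lustre.ko is installed for the running kernel ($running)"
else
    echo "no lustre.ko found under /lib/modules/$running"
fi
```

If the module shows up only under a different release directory, modprobe on the running kernel cannot find it, and depmod -a will not change that.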
Carlos Santana
2009-Jun-17 05:35 UTC
[Lustre-discuss] Lustre installation and configuration problems
Thanks Cliff.

depmod -a had already run successfully before. I am using a CentOS 5.2 box. The following packages are installed:

[root@localhost tmp]# rpm -qa | grep -i lustre
lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp

[root@localhost tmp]# uname -a
Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47 EDT 2008 i686 i686 i386 GNU/Linux

And here is the strace output for mount: http://www.heypasteit.com/clip/8WT

Any further debugging hints?

Thanks,
CS.
Carlos Santana
2009-Jun-17 15:18 UTC
[Lustre-discuss] Lustre installation and configuration problems
Huh... :( Sorry to bug you guys again...

I am planning to make a fresh start now, as nothing seems to have worked for me. If you have any comments or feedback, please share them.

I would like to confirm the installation order before I start over. In Arden's experience (http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html), the lustre-modules package is installed last. As I was installing Lustre 1.8, I was following the 1.8 operations manual (http://manual.lustre.org/index.php?title=Main_Page). The installation order in the manual is different from what Arden suggested.

Will it make a difference to the configuration at a later stage? Which one should I follow now? Any comments?

Thanks,
CS.
Carlos Santana
2009-Jun-17 15:20 UTC
[Lustre-discuss] Lustre installation and configuration problems
And is there any specific installation order for a patchless client? Could someone please share it with me?

- CS.
Jerome, Ron
2009-Jun-17 15:40 UTC
[Lustre-discuss] Lustre installation and configuration problems
I think the problem you have, as Cliff alluded to, is a mismatch between your running kernel version and the kernel version the Lustre modules were built for. You are running kernel "2.6.18-92.el5" but installing Lustre packages built for "2.6.18_92.1.17.el5". Note that the ".1.17" is significant: the modules will end up in the wrong directory.

There is a CentOS update that brings the kernel to the matching 2.6.18_92.1.17.el5 version; you can pull it from the updates directory on a CentOS mirror.

Ron.
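The kernel mismatch Ron describes can be checked directly on the affected machine. What follows is only a minimal sketch, assuming the modules are installed under /lib/modules/<kernel release>/ and that the client module file is named lustre.ko (as in Kevin's earlier find command). The two function names are hypothetical helpers written for this illustration; they are not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Lustre) to compare the running
# kernel against the kernel directories that actually contain lustre.ko.

# Print each kernel release directory under the modules root ($1,
# default /lib/modules) that contains a lustre.ko, one per line.
lustre_module_kernels() {
    ( cd "${1:-/lib/modules}" 2>/dev/null &&
      find . -name 'lustre.ko' | cut -d/ -f2 | sort -u )
}

# Compare a kernel release ($1) with the directories found under the
# modules root ($2, default /lib/modules).
# Returns 0 on a match, 1 on a mismatch, 2 if no lustre.ko was found.
check_lustre_kernel_match() {
    running=$1
    root=${2:-/lib/modules}
    found=$(lustre_module_kernels "$root" || true)
    if [ -z "$found" ]; then
        echo "no lustre.ko found under $root"
        return 2
    fi
    if printf '%s\n' "$found" | grep -Fxq "$running"; then
        echo "match: lustre.ko is installed for running kernel $running"
        return 0
    fi
    echo "mismatch: running kernel is $running, but lustre.ko is installed for:"
    printf '%s\n' "$found" | sed 's/^/  /'
    return 1
}

# Usage on a real system:
#   check_lustre_kernel_match "$(uname -r)"
```

On the machine from this thread, the running kernel was reported as 2.6.18-92.el5 while the installed packages carry 2.6.18_92.1.17.el5 in their names, so a check along these lines would be expected to report a mismatch or find no lustre.ko for the running kernel. After installing a matching kernel and rebooting, run depmod -a and try the modprobe again.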
Dr. Hung-Sheng Tsao (LaoTsao)
2009-Jun-17 15:43 UTC
[Lustre-discuss] Lustre installation and configuration problems
http://blogs.sun.com/manoj/entry/lustre_demo_flash
http://blogs.sun.com/manoj/entry/lustre_installation_multi_node

Carlos Santana wrote:
> And is there any specific installation order for patchless client?
> Could someone please share it with me?
>
> -
> CS.
Cliff White
2009-Jun-17 19:31 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos Santana wrote:
> Thanks Cliff.
>
> The depmod -a was successful before as well. I am using CentOS 5.2
> box. Following are the packages installed:
> [root@localhost tmp]# rpm -qa | grep -i lustre
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp

Those are server modules. You would need to add kernel-lustre-smp for that to work.

For a client, you install the matching vendor kernel, then:
lustre-client-modules
lustre-client

For a server, you need:
kernel-lustre-smp
lustre-modules
lustre- ldiskfs-

And as others have mentioned in this thread, the kernel version must match exactly. Check /lib/modules - if you have a mismatch, there will be an extra directory there.
cliffw

> [root@localhost tmp]# uname -a
> Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
> EDT 2008 i686 i686 i386 GNU/Linux
>
> And here is the output from strace for mount: http://www.heypasteit.com/clip/8WT
>
> Any further debugging hints?
>
> Thanks,
> CS.
Cliff White
2009-Jun-17 19:40 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos Santana wrote:
> Huh... :( Sorry to bug you guys again...
>
> I am planning to make a fresh start now as nothing seems to have worked
> for me. If you have any comments/feedback please share them.
>
> I would like to confirm installation order before I make a fresh start.
> From Arden's experience:
> http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html ,
> the lustre-module is installed last. As I was installing Lustre 1.8, I
> was referring to the 1.8 operations manual
> http://manual.lustre.org/index.php?title=Main_Page . The installation
> order in the manual is different than what Arden has suggested.
>
> Will it make a difference in configuration at later stage? Which one
> should I follow now?
> Any comments?

RPM installation order really doesn't matter. If you install in the 'wrong' order you will get a lot of warnings from RPM due to the relationship of the various RPMs. But these are harmless - whatever order you install in, it should work fine.
cliffw

> Thanks,
> CS.
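Editorial sketch (not part of the original thread): Cliff's earlier point, that the Lustre module packages must match the running kernel exactly and that a mismatch shows up as an extra directory under /lib/modules, can be checked mechanically. The helper names below (expected_kernel, kernel_matches, lustre_module_dirs) are hypothetical and exist only for illustration. The underscore-to-hyphen mapping is an assumption inferred from the package name and the /lib/modules directory name quoted in this thread, so treat the result as a hint rather than a definitive answer.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not shipped with Lustre.

# Print the kernel release a Lustre modules package appears to be built for.
# $1: a package name as printed by `rpm -qa`, for example
#     lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
expected_kernel() {
    # Strip "<name>-modules-<lustre version>-" from the front.
    rel=${1#*-modules-*-}
    # Assumption from this thread: the package spells the kernel release
    # with "_" where `uname -r` has its first "-".
    printf '%s\n' "$rel" | sed 's/_/-/'
}

# Succeed only if the package ($1) targets the given kernel release ($2).
kernel_matches() {
    [ "$(expected_kernel "$1")" = "$2" ]
}

# List the kernel-release directories under a modules root that contain
# lustre.ko. $1 defaults to /lib/modules.
lustre_module_dirs() {
    root=${1:-/lib/modules}
    find "$root" -name lustre.ko 2>/dev/null | while IFS= read -r ko; do
        rel=${ko#"$root"/}            # path relative to the modules root
        printf '%s\n' "${rel%%/*}"    # keep only the first path component
    done | sort -u
}
```

On a real node one would pass the name reported by `rpm -qa | grep -i lustre` and the output of `uname -r` to kernel_matches. With the values posted in this thread (modules package for 2.6.18_92.1.17.el5_lustre.1.8.0smp, running kernel 2.6.18-92.el5) the check fails, and lustre_module_dirs would be expected to list a directory other than the running kernel's.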
Arden Wiebe
2009-Jun-17 20:05 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos: Now that the obvious clue has been sleuthed out and identified the villainous depreciated kernel installation media can be destroyed. That should come in whatever form you feel appropriate from the good old Frisbee and forget or the always popular coaster contemplation collection. The order doesn''t matter that much - aside from correct kernel first. What matters is the thoughtful message "Are the Modules Loaded?" If your getting it you have missed installing one of the packages. When all else fails remove and reinstall or even force as the case may be sometimes with the e2fsprogs. This becomes quite a chore when your installing on more then two computers. What is needed is a bare bones Lustre installation dvd iso. I''m sure Brian plugged the one Sun offers in another post and for a fact one University runs and develops in house their own distribution that would be very interesting to obtain but it''s not public. Good luck Carlos and be sure to have plenty of inodes! Arden --- On Wed, 6/17/09, Jerome, Ron <Ron.Jerome at nrc-cnrc.gc.ca> wrote:> From: Jerome, Ron <Ron.Jerome at nrc-cnrc.gc.ca> > Subject: Re: [Lustre-discuss] Lustre installation and configuration problems > To: "Carlos Santana" <neubyr at gmail.com> > Cc: lustre-discuss at lists.lustre.org > Date: Wednesday, June 17, 2009, 8:40 AM > > > > > > > > > > > > > > > > I think the problem you have, > as Cliff alluded to, is a mismatch > between your kernel version ?and the Luster kernel > version modules.? > > ? > > You have kernel > ?2.6.18-92.el5? and are > installing Lustre > ?2.6.18_92.1.17.el5??? Note the > ?.1.17? > is significant as the modules will end up in the wrong > directory.? > There is an update to CentOS to bring the kernel to the > matching 2.6.18_92.1.17.el5 > version you can pull it off the CentOS mirror site in > the updates directory. > > ? > > ? > > Ron. > > ? 
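The kernel mismatch Ron describes (running kernel 2.6.18-92.el5, modules built for 2.6.18_92.1.17.el5) can be confirmed with the same `find ... lustre.ko` check Kevin suggested earlier in the thread. The following is only a minimal sketch, not something posted to the list. It assumes modules are installed under /lib/modules/<kernel release>, and the function names are hypothetical helpers, not part of Lustre.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Lustre) wrapping the thread's
# "find . | grep lustre.ko" check.

# has_lustre_module MODULES_ROOT KERNEL_RELEASE
# Succeeds if lustre.ko exists somewhere under MODULES_ROOT/KERNEL_RELEASE.
has_lustre_module() {
    [ -d "$1/$2" ] && [ -n "$(find "$1/$2" -name 'lustre.ko' 2>/dev/null)" ]
}

# kernels_with_lustre MODULES_ROOT
# Prints each kernel release directory under MODULES_ROOT that holds lustre.ko.
# MODULES_ROOT is assumed to be given without a trailing slash.
kernels_with_lustre() {
    find "$1" -name 'lustre.ko' 2>/dev/null |
        sed -e "s|^$1/||" -e 's|/.*||' | sort -u
}

# check_lustre_kernel_match [MODULES_ROOT] [KERNEL_RELEASE]
# Defaults to /lib/modules and the running kernel (uname -r).
check_lustre_kernel_match() {
    root=${1:-/lib/modules}
    release=${2:-$(uname -r)}
    if has_lustre_module "$root" "$release"; then
        echo "OK: lustre.ko found for running kernel $release"
    else
        echo "MISMATCH: no lustre.ko for running kernel $release"
        echo "lustre.ko is installed for:"
        kernels_with_lustre "$root" | sed 's/^/  /'
        return 1
    fi
}
```

Running `check_lustre_kernel_match` with no arguments on the affected node would be expected to report a mismatch in Carlos's situation, since `uname -r` and the module directory differ by the ".1.17" update level.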
Sheila Barthel
2009-Jun-17 20:08 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos -

The installation procedures for Lustre 1.6 and 1.8 are the same. The manual's installation procedure includes a table that shows which packages to install on servers and clients (I've attached a PDF of the table). The procedure also describes the installation order for packages: kernel, modules, ldiskfs, then utilities/userspace, then e2fsprogs.

http://manual.lustre.org/manual/LustreManual16_HTML/LustreInstallation.html#50401389_pgfId-1291574

Sheila

Cliff White wrote:
> Carlos Santana wrote:
>> Huh... :( Sorry to bug you guys again...
>>
>> I am planning to make a fresh start now, as nothing seems to have worked for me. If you have any comments/feedback, please share them.
>>
>> I would like to confirm the installation order before I make a fresh start. From Arden's experience (http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html), the lustre-modules package is installed last. As I was installing Lustre 1.8, I was referring to the 1.8 operations manual, http://manual.lustre.org/index.php?title=Main_Page . The installation order in the manual is different from what Arden has suggested.
>>
>> Will it make a difference in configuration at a later stage? Which one should I follow now?
>> Any comments?
>
> RPM installation order really doesn't matter. If you install in the 'wrong' order you will get a lot of warnings from RPM due to the relationship of the various RPMs. But these are harmless - whatever order you install in, it should work fine.
> cliffw
>
>> Thanks,
>> CS.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: LustreInstallTable.pdf
Type: application/pdf
Size: 24227 bytes
Desc: not available
Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20090617/cddaa210/attachment-0001.pdf
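The order Sheila lists (kernel, modules, ldiskfs, utilities/userspace, e2fsprogs) can be applied to a directory of downloaded RPMs with a small sort. This is a rough sketch under stated assumptions, not a procedure from the manual: the file-name patterns are inferred from the package names mentioned in this thread, and `rpm_rank` and `order_rpms` are hypothetical helper names. As Cliff notes, a different order should still work and only produces extra RPM warnings.

```shell
#!/bin/sh
# Hypothetical helpers; the patterns are assumptions based on the package
# names seen in this thread (kernel-lustre-*, lustre-modules-*, and so on).

# rpm_rank FILE
# Prints the install position for a Lustre-related RPM file name.
rpm_rank() {
    case "$(basename "$1")" in
        kernel-*)                                  echo 1 ;;
        lustre-modules-*|lustre-client-modules-*)  echo 2 ;;
        lustre-ldiskfs-*)                          echo 3 ;;
        lustre-*)                                  echo 4 ;;  # utilities/userspace
        e2fsprogs-*)                               echo 5 ;;
        *)                                         echo 9 ;;  # unknown: last
    esac
}

# order_rpms FILE...
# Prints the given RPM files, one per line, in install order.
order_rpms() {
    for f in "$@"; do
        printf '%s %s\n' "$(rpm_rank "$f")" "$f"
    done | sort -n -k1,1 | cut -d' ' -f2-
}
```

For example, `order_rpms *.rpm` prints the files in the order to hand to `rpm`; it deliberately does not run `rpm` itself, so the list can be reviewed first.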
Arden Wiebe
2009-Jun-17 20:21 UTC
[Lustre-discuss] Lustre installation and configuration problems
Cliff:

I have some questions about the client packages. I am not sure why the roadmap or Lustre users require separate client packages, but, stating the obvious, some people must need them. Is that correct? Otherwise, the server packages contain the client anyway, correct? If the latter, are the client packages for Linux somewhat redundant? When will a real client .exe for Windows become available?

Arden

--- On Wed, 6/17/09, Sheila Barthel <Sheila.Barthel at Sun.COM> wrote:

> From: Sheila Barthel <Sheila.Barthel at Sun.COM>
> Subject: Re: [Lustre-discuss] Lustre installation and configuration problems
> To: "Carlos Santana" <neubyr at gmail.com>
> Cc: "Cliff White" <Cliff.White at Sun.COM>, lustre-discuss at lists.lustre.org
> Date: Wednesday, June 17, 2009, 1:08 PM
>
> Carlos -
>
> The installation procedures for Lustre 1.6 and 1.8 are the same. The manual's installation procedure includes a table that shows which packages to install on servers and clients (I've attached a PDF of the table). The procedure also describes the installation order for packages: kernel, modules, ldiskfs, then utilities/userspace, then e2fsprogs.
>
> http://manual.lustre.org/manual/LustreManual16_HTML/LustreInstallation.html#50401389_pgfId-1291574
>
> Sheila
>
> Cliff White wrote:
> > Carlos Santana wrote:
> >> Huh... :( Sorry to bug you guys again...
> >>
> >> I am planning to make a fresh start now, as nothing seems to have worked for me. If you have any comments/feedback, please share them.
> >>
> >> I would like to confirm the installation order before I make a fresh start. From Arden's experience (http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html), the lustre-modules package is installed last. As I was installing Lustre 1.8, I was referring to the 1.8 operations manual, http://manual.lustre.org/index.php?title=Main_Page . The installation order in the manual is different from what Arden has suggested.
? ? > ? ? ? Run ''tunefs.lustre --print'' on your > servers, and > >>? ???verify that > >>? ? ? >>? ? ? > ???mgsnode> >>? ? ? >>? ? ? > ? ? ? is correct. > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? cliffw > >>? ? ? >> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? Thanks, > >>? ? ? >>? ? ? > ? ? ? ? ? CS. > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? On Tue, Jun 16, 2009 at > 12:16 PM, Cliff White > >>? ? ? >>? ? ? > ? ? ? ? ? <Cliff.White at sun.com > <mailto:Cliff.White at sun.com> > >>? ???<mailto:Cliff.White at sun.com > <mailto:Cliff.White at sun.com>> > >>? ? ? >>? ? ? > ???<mailto:Cliff.White at sun.com > <mailto:Cliff.White at sun.com> > >>? ???<mailto:Cliff.White at sun.com > <mailto:Cliff.White at sun.com>>> > >>? ? ? >>? ? ? > ? ? ? ? ? <mailto:Cliff.White at sun.com > >>? ???<mailto:Cliff.White at sun.com> > <mailto:Cliff.White at sun.com > >>? ???<mailto:Cliff.White at sun.com>> > >>? ? ? >>? ? ? > ???<mailto:Cliff.White at sun.com > <mailto:Cliff.White at sun.com> > >>? ???<mailto:Cliff.White at sun.com > <mailto:Cliff.White at sun.com>>>>> > wrote: > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ???Carlos > Santana wrote: > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???Thanks Kevin.. > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ???Please > read: > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ???http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529 > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ???Those > instructions are identical for 1.6 and 1.8. > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ???For > current lustre, only two commands are used for > >>? ? ? >>? ? ? > ???configuration. > >>? ? ? >>? ? ? > ? ? ? ? ? > ???mkfs.lustre and mount. > >>? ? ? >> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ???Usually > when lustre_rmmod returns that error, > >>? ???you run > >>? ? ? >>? ? ? > ???it a second > >>? ? ? >>? ? ? > ? ? ? ? ? ???time, > and it will clear things. Unless you > >>? 
???have live > >>? ? ? >>? ? ? > ???mounts or > >>? ? ? >>? ? ? > ? ? ? ? ? ???network > connections. > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ???cliffw > >>? ? ? >> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???I am referring to 1.8 manual, but I was > also > >>? ? ? >>? ? ? > ???referring to > >>? ? ? >>? ? ? > ? ? ? ? ? HowTo > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???page on wiki which seems to be for 1.6. > >>? ???The HowTo > >>? ? ? >> page > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ???http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???mentions abt lmc, lconf, and lctl. > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???The modules are installed in the right > >>? ???place. The ''$ > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???lustre_rmmod'' resulted in following o/p: > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???[root at localhost > >>? > ???2.6.18-92.1.17.el5_lustre.1.8.0smp]# > >>? ? ? >>? ? ? > ? ? ? ? ? lustre_rmmod > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module obdfilter is in use > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module ost is in use > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module mds is in use > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module fsfilt_ldiskfs is in use > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module mgs is in use > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module mgc is in use by mgs > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module ldiskfs is in use by > >>? ???fsfilt_ldiskfs > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module lov is in use > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module lquota is in use by > >>? ???obdfilter,mds > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module osc is in use > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module ksocklnd is in use > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module ptlrpc is in use by > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???obdfilter,ost,mds,mgs,mgc,lov,lquota,osc > >>? ? ? >>? ? ? 
> ? ? ? ? ? ? ? > ???ERROR: Module obdclass is in use by > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? > obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module lnet is in use by > >>? ? ? >>? ? ? > ???ksocklnd,ptlrpc,obdclass > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module lvfs is in use by > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? > ???obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???ERROR: Module libcfs is in use by > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? > ???obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???Do I need to shutdown these services? How > >>? ???can I do > >>? ? ? >>? ? ? > ???that? > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???Thanks, > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???CS. > >>? ? ? >> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???On Tue, Jun 16, 2009 at 11:36 AM, Kevin > >>? ???Van Maren > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???<Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com> > >>? ? ? >>? ? ? > ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com>> > <mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com> > >>? ? ? >>? ? ? > ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com>>> > >>? ? ? >>? ? ? > ? ? ? ? ? <mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com> > >>? ? ? >>? ? ? > ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com>> > <mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com> > >>? ? ? >>? ? ? > ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com>>>> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com> > >>? ? ? >>? ? ? 
> ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com>> > >>? ? ? >>? ? ? > ? ? ? ? ? <mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com> > >>? ? ? >>? ? ? > ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com>>> > <mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com> > >>? ? ? >>? ? ? > ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com>> > >>? ? ? >>? ? ? > ? ? ? ? ? <mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com> > >>? ? ? >>? ? ? > ???<mailto:Kevin.Vanmaren at sun.com > >>? ???<mailto:Kevin.Vanmaren at sun.com>>>>>> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???wrote: > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? I think lconf and lmc went away with Lustre > >>? ? ? >>? ? ? > ???1.6.? Are you > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? > ???sure you > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? are looking at the 1.8 manual, and not > >>? ? ? >>? ? ? > ???directions for 1.4? > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? /usr/sbin/lctl should be in the > >>? ? ? >>? ? ? > ???lustre-<version> RPM. > >>? ? ? >>? ? ? > ? ? ? ? ???Do a: > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # rpm -q -l > >>? ? ? >>? ? ? > ? ? ? ? ? > lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp > >>? ? ? >> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? Do make sure the modules are installed > >>? ???in the > >>? ? ? >>? ? ? > ???right place: > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # cd /lib/modules/`uname -r` > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # find . | grep lustre.ko > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? If it shows up, then do: > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # lustre_rmmod > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # depmod > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? and try again. > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? Otherwise, figure out where your > >>? 
???modules are > >>? ? ? >>? ? ? > ???installed: > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # uname -r > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # cd /lib/modules > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # find . | grep lustre.ko > >>? ? ? >> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? You can also double-check the NID.? On > >>? ???the MSD > >>? ? ? >>? ? ? > ???server, do > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? # lctl list_nids > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? Should show 10.0.0.42 at tcp0 > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ? ? ? > ? ? Kevin > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? > ???------------------------------------------------------------------------ > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? > _______________________________________________ > >>? ? ? >>? ? ? > ? ? ? ? ? Lustre-discuss mailing > list > >>? ? ? >>? ? ? > ? ? ? ? ? Lustre-discuss at lists.lustre.org > >>? ???<mailto:Lustre-discuss at lists.lustre.org> > >>? ? ? >>? ? ? > ???<mailto:Lustre-discuss at lists.lustre.org > >>? ???<mailto:Lustre-discuss at lists.lustre.org>> > >>? ? ? >>? ? ? > ? ? ? ? ? <mailto:Lustre-discuss at lists.lustre.org > >>? ???<mailto:Lustre-discuss at lists.lustre.org> > >>? ? ? >>? ? ? > ???<mailto:Lustre-discuss at lists.lustre.org > >>? ???<mailto:Lustre-discuss at lists.lustre.org>>> > >>? ? ? >> > >>? ? ? >>? ? ? > ? ? ? ? ? ???http://lists.lustre.org/mailman/listinfo/lustre-discuss > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? > ???------------------------------------------------------------------------ > >>? ? ? >> > >>? ? ? >>? ? ? > ???_______________________________________________ > >>? ? ? >>? ? ? > ???Lustre-discuss mailing list > >>? ? ? >>? ? ? > ???Lustre-discuss at lists.lustre.org > >>? ???<mailto:Lustre-discuss at lists.lustre.org> > >>? ? ? >>? ? ? > ???<mailto:Lustre-discuss at lists.lustre.org > >>? 
???<mailto:Lustre-discuss at lists.lustre.org>> > >>? ? ? >>? ? ? > ???http://lists.lustre.org/mailman/listinfo/lustre-discuss > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? ? ? >> > >>? > ???------------------------------------------------------------------------ > >>? ? ? >> > >>? ? ? >> > _______________________________________________ > >>? ? ? >> Lustre-discuss > mailing list > >>? ? ? >> Lustre-discuss at lists.lustre.org > >>? ???<mailto:Lustre-discuss at lists.lustre.org> > >>? ? ? >> http://lists.lustre.org/mailman/listinfo/lustre-discuss > >>? ? ? > > >>? ? ? > > >> > >> > >> > >> > ------------------------------------------------------------------------ > >> > >> _______________________________________________ > >> Lustre-discuss mailing list > >> Lustre-discuss at lists.lustre.org > >> http://lists.lustre.org/mailman/listinfo/lustre-discuss > >>? ??? > > > > _______________________________________________ > > Lustre-discuss mailing list > > Lustre-discuss at lists.lustre.org > > http://lists.lustre.org/mailman/listinfo/lustre-discuss > >??? > > > -----Inline Attachment Follows----- > > _______________________________________________ > Lustre-discuss mailing list > Lustre-discuss at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-discuss >
Cliff White
2009-Jun-17 22:55 UTC
[Lustre-discuss] Lustre installation and configuration problems
Arden Wiebe wrote:
> Cliff:
>
> I have some questions about the client packages. I am not sure why the roadmap or Lustre users require separate client packages but, stating the obvious, some people must need separate client packages. Is that correct?

The key here is 'patchless client'. Yes, any machine with the Lustre server bits installed can be a client. Not long ago, there was only one installation for Lustre, and everybody got the same bits. The Lustre design also re-uses things: any Lustre node connecting to a service has a 'client'. For example, the OSS is a 'client' of the MDS, and the MDS is a 'client' of the OSS.

The 'patchless client' was created to allow users to run Lustre with a stock vendor/distro kernel. This removes a lot of support and installation issues: servers can be considered 'Lustre-only' devices, but clients typically have other goop installed. Allowing users to use a stock distro kernel simplifies their support relationship with their other vendors.

> Otherwise the server packages contain the client anyhow, correct? If the latter, are the client packages for Linux somewhat redundant?

Yes, the client packages are somewhat redundant, if you don't mind a Lustre-patched kernel on your clients.

> When will the real client .exe for Windows become available?

No idea, see the roadmap.

cliffw
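As a rough illustration of the distinction Cliff draws between a Lustre-patched kernel and a stock one, the sketch below guesses which module package family fits a given kernel release string. It assumes that Lustre-patched kernels carry `_lustre` in their release string, as the 2.6.18-92.1.17.el5_lustre.1.8.0smp kernel in this thread does. The function name `lustre_package_family` is hypothetical; the two package names are the ones mentioned in this thread.

```shell
#!/bin/bash
# Hypothetical helper: print the module package family that suits a kernel.
# Assumption: a Lustre-patched kernel has "_lustre" in its release string.
lustre_package_family() {
    local release="$1"
    case "$release" in
        *_lustre*) echo "lustre-modules" ;;         # patched kernel (server bits)
        *)         echo "lustre-client-modules" ;;  # stock kernel (patchless client)
    esac
}
```

On a live machine this would be called as `lustre_package_family "$(uname -r)"`. It is only a naming heuristic and does not verify that a matching package exists for that kernel.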
Carlos Santana
2009-Jun-18 00:10 UTC
[Lustre-discuss] Lustre installation and configuration problems
Folks, It been unsuccessful till now.. I made a fresh CentOS 5.2 minimum install (2.6.18-92.el5). Later, I updated kernel to 2.6.18-92.1.17 version. Here is a output from uname and rpm query: [root at localhost ~]# rpm -qa | grep lustre lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp [root at localhost ~]# uname -a Linux localhost.localdomain 2.6.18-92.1.17.el5 #1 SMP Tue Nov 4 13:45:01 EST 2008 i686 i686 i386 GNU/Linux Other details: --- --- --- [root at localhost ~]# ls -l /lib/modules | grep 2.6 drwxr-xr-x 6 root root 4096 Jun 17 18:47 2.6.18-92.1.17.el5 drwxr-xr-x 6 root root 4096 Jun 17 17:38 2.6.18-92.el5 [root at localhost modules]# find . | grep lustre ./2.6.18-92.1.17.el5/kernel/net/lustre ./2.6.18-92.1.17.el5/kernel/net/lustre/libcfs.ko ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet.ko ./2.6.18-92.1.17.el5/kernel/net/lustre/ksocklnd.ko ./2.6.18-92.1.17.el5/kernel/net/lustre/ko2iblnd.ko ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet_selftest.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre ./2.6.18-92.1.17.el5/kernel/fs/lustre/osc.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/ptlrpc.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdecho.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/lvfs.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/mgc.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/llite_lloop.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/lov.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/mdc.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/lquota.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/lustre.ko ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdclass.ko --- --- --- I am still having same problem. I seriously doubt, am I missing anything? I also tried a source install for ''patchless client'', however I have been consistent in its results too. Are there any configuration steps needed after rpm (or source) installation? The one that I know of is restricting interfaces in modeprobe.conf, however I have tried it on-n-off with no success. 
Could anyone please suggest any debugging and tests for the same? How can I provide you more valuable output to help me? Any insights? Also, I have a suggestion here. It might be good idea to check for ''uname -r'' check in RPM installation to check for matching kernel version and if not suggest for source install. Thanks for the help. I really appreciate your patience.. - Thanks, CS. On Wed, Jun 17, 2009 at 10:40 AM, Jerome, Ron<Ron.Jerome at nrc-cnrc.gc.ca> wrote:> I think the problem you have, as Cliff alluded to, is a mismatch between > your kernel version ?and the Luster kernel version modules. > > > > You have kernel ?2.6.18-92.el5? and are installing Lustre > ?2.6.18_92.1.17.el5??? Note the ?.1.17? is significant as the modules will > end up in the wrong directory.? There is an update to CentOS to bring the > kernel to the matching 2.6.18_92.1.17.el5 version you can pull it off the > CentOS mirror site in the updates directory. > > > > > > Ron. > > > > From: lustre-discuss-bounces at lists.lustre.org > [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Carlos Santana > Sent: June 17, 2009 11:21 AM > To: lustre-discuss at lists.lustre.org > Subject: Re: [Lustre-discuss] Lustre installation and configuration problems > > > > And is there any specific installation order for patchless client? Could > someone please share it with me? > > - > CS. > > On Wed, Jun 17, 2009 at 10:18 AM, Carlos Santana <neubyr at gmail.com> wrote: > > Huh... :( Sorry to bug you guys again... > > I am planning to make a fresh start now as nothing seems to have worked for > me. If you have any comments/feedback please share them. > > I would like to confirm installation order before I make a fresh start. From > Arden''s experience: > http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html , the > lusre-module is installed last. As I was installing Lustre 1.8, I was > referring 1.8 operations manual > http://manual.lustre.org/index.php?title=Main_Page . 
> The installation order in the manual is different than what Arden has suggested.
>
> Will it make a difference in configuration at later stage? Which one should
> I follow now?
> Any comments?
>
> Thanks,
> CS.
>
> On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana <neubyr at gmail.com> wrote:
>
> Thanks Cliff.
>
> The depmod -a was successful before as well. I am using CentOS 5.2
> box. Following are the packages installed:
> [root@localhost tmp]# rpm -qa | grep -i lustre
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> [root@localhost tmp]# uname -a
> Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47 EDT 2008 i686 i686 i386 GNU/Linux
>
> And here is a output from strace for mount:
> http://www.heypasteit.com/clip/8WT
>
> Any further debugging hints?
>
> Thanks,
> CS.
>
> On 6/16/09, Cliff White <Cliff.White at sun.com> wrote:
>> Carlos Santana wrote:
>>> The '$ modprobe -l lustre*' did not show any module on a patchless
>>> client. modprobe -v returns 'FATAL: Module lustre not found'.
>>>
>>> How do I install a patchless client?
>>> I have tried lustre-client-modules and lustre-client-ver rpm packages in
>>> both sequences. Am I missing anything?
>>>
>>
>> Make sure the lustre-client-modules package matches your running kernel.
>> Run depmod -a to be sure
>> cliffw
>>
>>> Thanks,
>>> CS.
>>>
>>> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White <Cliff.White at sun.com> wrote:
>>>
>>>     Carlos Santana wrote:
>>>
>>>         The lctl ping and 'net up' failed with the following messages:
>>>         --- ---
>>>         [root@localhost ~]# lctl ping 10.0.0.42
>>>         opening /dev/lnet failed: No such device
>>>         hint: the kernel modules may not be loaded
>>>         failed to ping 10.0.0.42@tcp: No such device
>>>
>>>         [root@localhost ~]# lctl network up
>>>         opening /dev/lnet failed: No such device
>>>         hint: the kernel modules may not be loaded
>>>         LNET configure error 19: No such device
>>>
>>>     Make sure modules are unloaded, then try modprobe -v.
>>>     Looks like you have lnet mis-configured, if your module options are
>>>     wrong, you will see an error during the modprobe.
>>>     cliffw
>>>
>>>         --- ---
>>>
>>>         I tried lustre_rmmod and depmod commands and it did not return
>>>         any error messages. Any further clues? Reinstall patchless
>>>         client again?
>>>
>>>         -
>>>         CS.
>>>
>>>         On Tue, Jun 16, 2009 at 1:32 PM, Cliff White <Cliff.White at sun.com> wrote:
>>>
>>>            Carlos Santana wrote:
>>>
>>>                I was able to run lustre_rmmod and depmod successfully. The
>>>                '$lctl list_nids' returned the server ip address and interface
>>>                (tcp0).
>>>
>>>                I tried to mount the file system on a remote client, but it
>>>                failed with the following message.
>>>                --- ---
>>>                [root@localhost ~]# mount -t lustre 10.0.0.42@tcp0:/lustre /mnt/lustre
>>>                mount.lustre: mount 10.0.0.42@tcp0:/lustre at /mnt/lustre failed: No such device
>>>                Are the lustre modules loaded?
>>>                Check /etc/modprobe.conf and /proc/filesystems
>>>                Note 'alias lustre llite' should be removed from modprobe.conf
>>>                --- ---
>>>
>>>                However, the mounting is successful on a single node
>>>                configuration - with client on the same machine as MDS and OST.
>>>                Any clues? Where to look for logs and debug messages?
>>>
>>>            Syslog || /var/log/messages is the normal place.
>>>
>>>            You can use 'lctl ping' to verify that the client can reach the server.
>>>            Usually in these cases, it's a network/name misconfiguration.
>>>
>>>            Run 'tunefs.lustre --print' on your servers, and verify that mgsnode
>>>            is correct.
>>>
>>>            cliffw
>>>
>>>                Thanks,
>>>                CS.
>>>
>>>                On Tue, Jun 16, 2009 at 12:16 PM, Cliff White <Cliff.White at sun.com> wrote:
>>>
>>>                  Carlos Santana wrote:
>>>
>>>                      Thanks Kevin..
>>>
>>>                  Please read:
>>>
>>>                  http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
>>>
>>>                  Those instructions are identical for 1.6 and 1.8.
>>>
>>>                  For current lustre, only two commands are used for configuration.
>>>                  mkfs.lustre and mount.
>>>
>>>                  Usually when lustre_rmmod returns that error, you run it a second
>>>                  time, and it will clear things. Unless you have live mounts or
>>>                  network connections.
>>>
>>>                  cliffw
>>>
>>>                      I am referring to 1.8 manual, but I was also referring to HowTo
>>>                      page on wiki which seems to be for 1.6. The HowTo page
>>>                      http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
>>>                      mentions abt lmc, lconf, and lctl.
>>>
>>>                      The modules are installed in the right place. The
>>>                      '$ lustre_rmmod' resulted in following o/p:
>>>                      [root@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
>>>                      ERROR: Module obdfilter is in use
>>>                      ERROR: Module ost is in use
>>>                      ERROR: Module mds is in use
>>>                      ERROR: Module fsfilt_ldiskfs is in use
>>>                      ERROR: Module mgs is in use
>>>                      ERROR: Module mgc is in use by mgs
>>>                      ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
>>>                      ERROR: Module lov is in use
>>>                      ERROR: Module lquota is in use by obdfilter,mds
>>>                      ERROR: Module osc is in use
>>>                      ERROR: Module ksocklnd is in use
>>>                      ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
>>>                      ERROR: Module obdclass is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
>>>                      ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
>>>                      ERROR: Module lvfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
>>>                      ERROR: Module libcfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
>>>
>>>                      Do I need to shutdown these services? How can I do that?
>>>
>>>                      Thanks,
>>>                      CS.
>>>
Cliff White
2009-Jun-18 00:27 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos Santana wrote:
> Folks,
>
> It been unsuccessful till now..
>
> I made a fresh CentOS 5.2 minimum install (2.6.18-92.el5). Later, I
> updated kernel to 2.6.18-92.1.17 version. Here is a output from uname
> and rpm query:
>
> [root@localhost ~]# rpm -qa | grep lustre
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> [root@localhost ~]# uname -a
> Linux localhost.localdomain 2.6.18-92.1.17.el5 #1 SMP Tue Nov 4 13:45:01 EST 2008 i686 i686 i386 GNU/Linux

I think you are missing a basic point here. It's been mentioned a few times. You don't have a Lustre-patched kernel installed. Here's what a proper system looks like - it's 1.6.7.2, but that doesn't matter, 1.8.0 is the same.

# rpm -qa | grep lustre
lustre-1.6.7-2.6.18_92.1.17.el5_lustre.1.6.7smp
kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.6.7
lustre-ldiskfs-3.0.7-2.6.18_92.1.17.el5_lustre.1.6.7smp
lustre-modules-1.6.7-2.6.18_92.1.17.el5_lustre.1.6.7smp
# uname -a
Linux bun2 2.6.18-92.1.17.el5_lustre.1.6.7smp #1 SMP Tue Feb 24 19:59:12 MST 2009 i686 i686 i386 GNU/Linux

Notice the difference? Two additional RPMs, and the version strings of modules and kernel match _exactly_.

cliffw
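The comparison Cliff describes can be sketched in shell. This is only an illustration, not something from the Lustre packages: the helper name modules_match_kernel is made up, and it assumes the package name ends with the kernel version written with '_' in place of '-', as in the rpm output quoted in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): succeeds when the release field
# of a lustre-modules package name corresponds to a kernel version string.
modules_match_kernel() {
    pkg=$1      # e.g. the output of: rpm -q lustre-modules
    kernel=$2   # e.g. the output of: uname -r
    # An RPM release field cannot contain '-', so the package name spells
    # the kernel version with '_' where the kernel itself has '-'.
    release=$(printf '%s\n' "$kernel" | sed 's/-/_/g')
    case "$pkg" in
        *-"$release") return 0 ;;
        *)            return 1 ;;
    esac
}

# Strings copied from the thread: the stock CentOS kernel Carlos is running,
# then the Lustre-patched kernel the modules were built for.
pkg=lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
for kernel in 2.6.18-92.1.17.el5 2.6.18-92.1.17.el5_lustre.1.8.0smp; do
    if modules_match_kernel "$pkg" "$kernel"; then
        echo "$kernel: modules match"
    else
        echo "$kernel: modules do NOT match"
    fi
done
```

On a real node the two arguments would come from `rpm -q lustre-modules` and `uname -r`. Newer rpm versions may append the architecture to the package name, which this sketch does not handle.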
Arden Wiebe
2009-Jun-18 00:36 UTC
[Lustre-discuss] Lustre installation and configuration problems
Carlos: This client of mine works. Matter of fact on all my clients it works. [root at lustreone]# rpm -qa | grep -i lustre lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0 Otherwise your output for the same command lists only 2 packages installed so you are missing some packages - those being the client packages if you don''t want to use the patched kernel method of making a client as I have done above. If you issue the rpm commands I mentioned in the very first response of this thread you will have a working client. Arden --- On Wed, 6/17/09, Carlos Santana <neubyr at gmail.com> wrote:> From: Carlos Santana <neubyr at gmail.com> > Subject: Re: [Lustre-discuss] Lustre installation and configuration problems > To: "Jerome, Ron" <Ron.Jerome at nrc-cnrc.gc.ca> > Cc: lustre-discuss at lists.lustre.org > Date: Wednesday, June 17, 2009, 5:10 PM > Folks, > > It been unsuccessful till now.. > > I made a fresh CentOS 5.2 minimum install (2.6.18-92.el5). > Later, I > updated kernel to 2.6.18-92.1.17 version. Here is a output > from uname > and rpm query: > > [root at localhost ~]# rpm -qa | grep lustre > lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp > lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp > [root at localhost ~]# uname -a > Linux localhost.localdomain 2.6.18-92.1.17.el5 #1 SMP Tue > Nov 4 > 13:45:01 EST 2008 i686 i686 i386 GNU/Linux > > Other details: > --- --- --- > [root at localhost ~]# ls -l /lib/modules | grep 2.6 > drwxr-xr-x 6 root root 4096 Jun 17 18:47 > 2.6.18-92.1.17.el5 > drwxr-xr-x 6 root root 4096 Jun 17 17:38 2.6.18-92.el5 > > > [root at localhost modules]# find . 
| grep lustre > ./2.6.18-92.1.17.el5/kernel/net/lustre > ./2.6.18-92.1.17.el5/kernel/net/lustre/libcfs.ko > ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet.ko > ./2.6.18-92.1.17.el5/kernel/net/lustre/ksocklnd.ko > ./2.6.18-92.1.17.el5/kernel/net/lustre/ko2iblnd.ko > ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet_selftest.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre > ./2.6.18-92.1.17.el5/kernel/fs/lustre/osc.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/ptlrpc.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdecho.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lvfs.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/mgc.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/llite_lloop.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lov.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/mdc.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lquota.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lustre.ko > ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdclass.ko > --- --- --- > > > I am still having same problem. I seriously doubt, am I > missing anything? > I also tried a source install for ''patchless client'', > however I have > been consistent in its results too. > > Are there any configuration steps needed after rpm (or > source) > installation? The one that I know of is restricting > interfaces in > modeprobe.conf, however I have tried it on-n-off with no > success. > Could anyone please suggest any debugging and tests for the > same? How > can I provide you more valuable output to help me? Any > insights? > > Also, I have a suggestion here. It might be good idea to > check for > ''uname -r'' check in RPM installation to check for matching > kernel > version and if not suggest for source install. > > Thanks for the help. I really appreciate your patience.. > > - > Thanks, > CS. > > > On Wed, Jun 17, 2009 at 10:40 AM, Jerome, Ron<Ron.Jerome at nrc-cnrc.gc.ca> > wrote: > > I think the problem you have, as Cliff alluded to, is > a mismatch between > > your kernel version ?and the Luster kernel version > modules. 
> > > > > > > > You have kernel ?2.6.18-92.el5? and are installing > Lustre > > ?2.6.18_92.1.17.el5??? Note the ?.1.17? is > significant as the modules will > > end up in the wrong directory.? There is an update to > CentOS to bring the > > kernel to the matching 2.6.18_92.1.17.el5 version you > can pull it off the > > CentOS mirror site in the updates directory. > > > > > > > > > > > > Ron. > > > > > > > > From: lustre-discuss-bounces at lists.lustre.org > > [mailto:lustre-discuss-bounces at lists.lustre.org] > On Behalf Of Carlos Santana > > Sent: June 17, 2009 11:21 AM > > To: lustre-discuss at lists.lustre.org > > Subject: Re: [Lustre-discuss] Lustre installation and > configuration problems > > > > > > > > And is there any specific installation order for > patchless client? Could > > someone please share it with me? > > > > - > > CS. > > > > On Wed, Jun 17, 2009 at 10:18 AM, Carlos Santana > <neubyr at gmail.com> > wrote: > > > > Huh... :( Sorry to bug you guys again... > > > > I am planning to make a fresh start now as nothing > seems to have worked for > > me. If you have any comments/feedback please share > them. > > > > I would like to confirm installation order before I > make a fresh start. From > > Arden''s experience: > > http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html > , the > > lusre-module is installed last. As I was installing > Lustre 1.8, I was > > referring 1.8 operations manual > > http://manual.lustre.org/index.php?title=Main_Page . > The installation order > > in the manual is different than what Arden has > suggested. > > > > Will it make a difference in configuration at later > stage? Which one should > > I follow now? > > Any comments? > > > > Thanks, > > CS. > > > > > > > > On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana > <neubyr at gmail.com> > wrote: > > > > Thanks Cliff. > > > > The depmod -a was successful before as well. I am > using CentOS 5.2 > > box. 
Following are the packages installed: > > [root at localhost tmp]# rpm -qa | grep -i lustre > > > lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp > > > > lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp > > > > [root at localhost tmp]# uname -a > > > > Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue > Jun 10 18:49:47 > > EDT 2008 i686 i686 i386 GNU/Linux > > > > And here is a output from strace for mount: > > http://www.heypasteit.com/clip/8WT > > > > Any further debugging hints? > > > > Thanks, > > CS. > > > > On 6/16/09, Cliff White <Cliff.White at sun.com> > wrote: > >> Carlos Santana wrote: > >>> The ''$ modprobe -l lustre*'' did not show any > module on a patchless > >>> client. modprobe -v returns ''FATAL: Module > lustre not found''. > >>> > >>> How do I install a patchless client? > >>> I have tried lustre-client-modules and > lustre-client-ver rpm packages in > >>> both sequences. Am I missing anything? > >>> > >> > >> Make sure the lustre-client-modules package > matches your running kernel. > >> Run depmod -a to be sure > >> cliffw > >> > >>> Thanks, > >>> CS. > >>> > >>> > >>> > >>> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White > <Cliff.White at sun.com > >>> <mailto:Cliff.White at sun.com>> > wrote: > >>> > >>> ? ? Carlos Santana wrote: > >>> > >>> ? ? ? ? The lctlt ping and ''net up'' failed > with the following messages: > >>> ? ? ? ? --- --- > >>> ? ? ? ? [root at localhost ~]# lctl ping > 10.0.0.42 > >>> ? ? ? ? opening /dev/lnet failed: No such > device > >>> ? ? ? ? hint: the kernel modules may not > be loaded > >>> ? ? ? ? failed to ping 10.0.0.42 at tcp: No > such device > >>> > >>> ? ? ? ? [root at localhost ~]# lctl network > up > >>> ? ? ? ? opening /dev/lnet failed: No such > device > >>> ? ? ? ? hint: the kernel modules may not > be loaded > >>> ? ? ? ? LNET configure error 19: No such > device > >>> > >>> > >>> ? ? Make sure modules are unloaded, then try > modprobe -v. > >>> ? ? 
Looks like you have lnet mis-configured, > if your module options are > >>> ? ? wrong, you will see an error during the > modprobe. > >>> ? ? cliffw > >>> > >>> ? ? ? ? --- --- > >>> > >>> > >>> ? ? ? ? I tried lustre_rmmod and depmod > commands and it did not return > >>> ? ? ? ? any error messages. Any further > clues? Reinstall patchless > >>> ? ? ? ? client again? > >>> > >>> ? ? ? ? - > >>> ? ? ? ? CS. > >>> > >>> > >>> ? ? ? ? On Tue, Jun 16, 2009 at 1:32 PM, > Cliff White > >>> ? ? ? ? <Cliff.White at sun.com > <mailto:Cliff.White at sun.com> > >>> ? ? ? ? <mailto:Cliff.White at sun.com > <mailto:Cliff.White at sun.com>>> > wrote: > >>> > >>> ? ? ? ? ? ?Carlos Santana wrote: > >>> > >>> ? ? ? ? ? ? ? ?I was able to run > lustre_rmmod and depmod successfully. > >>> The > >>> ? ? ? ? ? ? ? ?''$lctl list_nids'' > returned the server ip address and > >>> ? ? ? ? interface > >>> ? ? ? ? ? ? ? ?(tcp0). > >>> > >>> ? ? ? ? ? ? ? ?I tried to mount the > file system on a remote client, but > >>> it > >>> ? ? ? ? ? ? ? ?failed with the > following message. > >>> ? ? ? ? ? ? ? ?--- --- > >>> ? ? ? ? ? ? ? ?[root at localhost ~]# > mount -t lustre 10.0.0.42 at tcp0:/lustre > >>> ? ? ? ? ? ? ? ?/mnt/lustre > >>> ? ? ? ? ? ? ? ?mount.lustre: mount > 10.0.0.42 at tcp0:/lustre at /mnt/lustre > >>> ? ? ? ? ? ? ? ?failed: No such device > >>> ? ? ? ? ? ? ? ?Are the lustre modules > loaded? > >>> ? ? ? ? ? ? ? ?Check > /etc/modprobe.conf and /proc/filesystems > >>> ? ? ? ? ? ? ? ?Note ''alias lustre > llite'' should be removed from > >>> ? ? ? ? modprobe.conf > >>> ? ? ? ? ? ? ? ?--- --- > >>> > >>> ? ? ? ? ? ? ? ?However, the mounting > is successful on a single node > >>> ? ? ? ? ? ? ? ?configuration - with > client on the same machine as MDS > >>> ? ? ? ? and OST. > >>> ? ? ? ? ? ? ? ?Any clues? Where to > look for logs and debug messages? > >>> > >>> > >>> ? ? ? ? ? ?Syslog || /var/log/messages > is the normal place. > >>> > >>> ? ? ? ? ? 
> >>>     You can use 'lctl ping' to verify that the client can reach the server.
> >>>     Usually in these cases, it's a network/name misconfiguration.
> >>>
> >>>     Run 'tunefs.lustre --print' on your servers, and verify that mgsnode=
> >>>     is correct.
> >>>
> >>>     cliffw
> >>>
> >>>         Thanks,
> >>>         CS.
> >>>
> >>>         On Tue, Jun 16, 2009 at 12:16 PM, Cliff White <Cliff.White at sun.com> wrote:
> >>>
> >>>             Carlos Santana wrote:
> >>>
> >>>                 Thanks Kevin..
> >>>
> >>>             Please read:
> >>>             http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
> >>>
> >>>             Those instructions are identical for 1.6 and 1.8.
> >>>
> >>>             For current lustre, only two commands are used for configuration:
> >>>             mkfs.lustre and mount.
> >>>
> >>>             Usually when lustre_rmmod returns that error, you run it a second
> >>>             time, and it will clear things. Unless you have live mounts or
> >>>             network connections.
> >>>
> >>>             cliffw
> >>>
> >>>                 I am referring to 1.8 manual, but I was also referring to HowTo
> >>>                 page on wiki which seems to be for 1.6. The HowTo page
> >>>                 http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
> >>>                 mentions about lmc, lconf, and lctl.
> >>>
> >>>                 The modules are installed in the right place. The
> >>>                 '$ lustre_rmmod' resulted in following o/p:
> >>>                 [root@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
> >>>                 ERROR: Module obdfilter is in use
> >>>                 ERROR: Module ost is in use
> >>>                 ERROR: Module mds is in use
> >>>                 ERROR: Module fsfilt_ldiskfs is in use
> >>>                 ERROR: Module mgs is in use
> >>>                 ERROR: Module mgc is in use by mgs
> >>>                 ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
> >>>                 ERROR: Module lov is in use
> >>>                 ERROR: Module lquota is in use by obdfilter,mds
> >>>                 ERROR: Module osc is in use
> >>>                 ERROR: Module ksocklnd is in use
> >>>                 ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
> >>>                 ERROR: Module obdclass is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
> >>>                 ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
> >>>                 ERROR: Module lvfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
> >>>                 ERROR: Module libcfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
> >>>
> >>>                 Do I need to shutdown these services? How can I do that?
> >>>
> >>>                 Thanks,
> >>>                 CS.
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
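Cliff's advice in the quoted exchange (lustre_rmmod usually succeeds on a second run unless there are live mounts) suggests checking for mounted Lustre filesystems before trying to unload the modules. The following is only a rough sketch of that check, not something posted in the thread. The function names `lustre_mounts` and `check_before_rmmod` are hypothetical, and it assumes Lustre mounts are listed with filesystem type `lustre` in /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper: list mount points whose filesystem type is "lustre".
# $1: mounts table to read (defaults to /proc/mounts; the parameter exists
#     only so the function can be tried against a sample file).
lustre_mounts() {
    awk '$3 == "lustre" { print $2 }' "${1:-/proc/mounts}"
}

# Hypothetical helper: report live Lustre mounts and fail if any exist,
# since the modules cannot be unloaded while they are in use.
check_before_rmmod() {
    mounts=$(lustre_mounts "$1")
    if [ -n "$mounts" ]; then
        echo "still mounted, umount these first:"
        echo "$mounts"
        return 1
    fi
    echo "no live Lustre mounts; lustre_rmmod can be run (a second time if needed)"
}
```

If `check_before_rmmod` reports mount points, unmounting them with `umount` and then running `lustre_rmmod` is the sequence the quoted advice implies. Live network connections, the other condition Cliff mentions, are not covered by this sketch.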
Carlos Santana
2009-Jun-19 18:51 UTC
[Lustre-discuss] Lustre installation and configuration problems
Guys,

Thanks a lot for all the help. I was able to build a patchless client from source. The basic verification tests (Unix commands) were successful.

I had an issue with the latest CentOS kernel, 2.6.18-128.el5, though. I started with a minimal install (without gcc) and then installed gcc through yum, which has a dependency on the kernel-headers package. By default, CentOS 5.2 selects that package from the updates repo, so one may end up with 2.6.18-92.el5 for the kernel and 2.6.18-128.el5 for kernel-headers. I also tried building against the latest 2.6.18-128.el5 kernel; however, it had an issue as pointed out here: http://lists.lustre.org/pipermail/lustre-discuss/2009-May/010560.html (bug fixed: https://bugzilla.lustre.org/show_bug.cgi?id=19024 ).

Thank you, everyone. Excited to get started with Lustre.

- CS.

On Wed, Jun 17, 2009 at 7:36 PM, Arden Wiebe <albert682 at yahoo.com> wrote:
>
> Carlos:
>
> This client of mine works. Matter of fact on all my clients it works.
>
> [root@lustreone]# rpm -qa | grep -i lustre
> lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0
>
> Otherwise your output for the same command lists only 2 packages installed,
> so you are missing some packages - those being the client packages if you
> don't want to use the patched kernel method of making a client as I have
> done above. If you issue the rpm commands I mentioned in the very first
> response of this thread you will have a working client.
>
> Arden
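The recurring problem in this thread was a mismatch between the running kernel and the kernel the Lustre modules were built for, which leaves the modules in a /lib/modules directory the running kernel never searches. The sketch below is one possible way to check for that, not a tool that ships with Lustre. The function names `lustre_module_kernels` and `kernel_has_lustre` are hypothetical, and the `kernel/fs/lustre/lustre.ko` path is an assumption taken from the `find` output quoted earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: print every kernel release under a modules root
# that contains lustre.ko.
# $1: modules root (defaults to /lib/modules; the parameter exists only so
#     the function can be tried against a scratch directory).
lustre_module_kernels() {
    root=${1:-/lib/modules}
    for ko in "$root"/*/kernel/fs/lustre/lustre.ko; do
        [ -e "$ko" ] || continue      # glob matched nothing
        rel=${ko#"$root"/}            # strip the modules root prefix
        echo "${rel%%/*}"             # keep only the release directory name
    done
}

# Hypothetical helper: succeed only if the given kernel release
# (default: the running kernel from uname -r) has lustre.ko installed.
# $1: kernel release, $2: modules root.
kernel_has_lustre() {
    release=${1:-$(uname -r)}
    lustre_module_kernels "$2" | grep -Fxq "$release"
}
```

A possible use on a client is `kernel_has_lustre || echo "no Lustre modules for $(uname -r)"`. If the check fails while `lustre_module_kernels` lists some other release, that points to the kind of version mismatch described above, and either updating the kernel to the matching release or building the patchless client from source against the running kernel would be the options discussed in this thread.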