Heald, Nathan T.
2008-Sep-23 21:55 UTC
[Lustre-discuss] Lustre 1.6.4.3 install/upgrade problem
Hi,

I am trying to get Lustre 1.6 running on my test cluster (RHEL4) but am running into problems. I was originally trying to upgrade from Lustre 1.4.10 to 1.6.4.3 and got this error when upgrading the MDS (I have not touched the OSS nodes yet). Then I decided to do a fresh format of the MDT device (a Lustre 1.6 install rather than an upgrade, page 4.2.1.2 in the Lustre manual), and I am still getting the same error.

Before performing the upgrade I removed all Lustre 1.4 RPMs, installed the 1.6 RPMs, and upgraded my kernel to match the Lustre 1.6 RPMs. I ran this to make a new filesystem on the MDT, which seemed to complete OK:

    mkfs.lustre --fsname spfs --mdt --mgs /dev/mpath/mpath2

When I try to start it I get an error (same error for an upgrade or a fresh install):

    [root@myhost lustre]# mount -t lustre /dev/mpath/mpath2 /mnt/test/mdt
    mount.lustre: mount /dev/mpath/mpath2 at /mnt/test/mdt failed: No such device
    Are the lustre modules loaded?
    Check /etc/modprobe.conf and /proc/filesystems
    Note 'alias lustre llite' should be removed from modprobe.conf

I checked my modprobe.conf file to verify that the alias in the error message does not exist. And yes, /mnt/test/mdt exists. I've been watching /var/log/messages but nothing shows up there either.

Obviously I've missed something; I'm still learning Lustre. Any suggestions would be appreciated.

Thanks,
-Nathan
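The error text above suggests checking /proc/filesystems. As a rough sketch of that check (the function name `fs_registered` is made up for illustration and is not part of Lustre), something like the following reports whether a filesystem type is currently registered with the kernel:

```shell
#!/bin/sh
# Report whether a filesystem type is registered with the kernel.
# Usage: fs_registered TYPE [FILE]; FILE defaults to /proc/filesystems.
# fs_registered is a hypothetical helper name, not a Lustre tool.
fs_registered() {
    fstype=$1
    file=${2:-/proc/filesystems}
    # Each line is "[nodev]<TAB>name"; the name is always the last field.
    awk -v t="$fstype" '$NF == t { found = 1 } END { exit !found }' "$file"
}

if fs_registered lustre; then
    echo "lustre is registered"
else
    echo "lustre is not registered; try 'modprobe lustre' and check dmesg"
fi
```

If the type is missing, running `modprobe lustre` by hand and reading the kernel log afterwards may show why the modules refuse to load, since a failure there would not necessarily appear in the mount error itself.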
Andreas Dilger
2008-Sep-24 06:59 UTC
[Lustre-discuss] Lustre 1.6.4.3 install/upgrade problem
On Sep 23, 2008 17:55 -0400, Heald, Nathan T. wrote:
> When I try to start it I get an error (Same error for an upgrade or fresh install):
>
> [root@myhost lustre]# mount -t lustre /dev/mpath/mpath2 /mnt/test/mdt
> mount.lustre: mount /dev/mpath/mpath2 at /mnt/test/mdt failed: No such device
> Are the lustre modules loaded?
> Check /etc/modprobe.conf and /proc/filesystems
> Note 'alias lustre llite' should be removed from modprobe.conf

Is /dev/mpath/mpath2 a symlink to some real block device? That can cause problems in some cases.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Heald, Nathan T.
2008-Sep-24 14:06 UTC
[Lustre-discuss] Lustre 1.6.4.3 install/upgrade problem
Yes it is a symlink. I just now tried mounting the actual block device and am getting the same error unfortunately. Thanks for the suggestion though. -Nathan
Heald, Nathan T.
2008-Sep-25 00:39 UTC
[Lustre-discuss] Lustre 1.6.4.3 install/upgrade problem
Found my initial problem: the network interface used for Lustre communication (Myrinet) didn't come up after the kernel upgrade this time. I resolved that, rebuilt everything from scratch, and now have an upgraded Lustre 1.6.4.3 running on my MGS and OSS nodes. However, I'm getting these messages on the MGS when I converted my last OSS node:

    Sep 24 19:44:34 mymds kernel: Lustre: upgrading server lustre-OST0001 from pre-1.6
    Sep 24 19:44:34 mymds kernel: LustreError: 14c-9: lustre-client is supposedly an old log, but no old LOV or MDT was found. Consider updating the configuration with --writeconf.
    Sep 24 19:44:34 mymds kernel: LustreError: 149-c: Failed to find lustre-OST0001 in the old client log. Apparently it is not part of this filesystem, or the old log is wrong.
    Sep 24 19:44:34 mymds kernel: Use 'writeconf' on the MDT to force log regeneration.
    Sep 24 19:44:39 mymds kernel: Lustre: upgrading server lustre-OST0003 from pre-1.6
    Sep 24 19:44:39 mymds kernel: LustreError: 14c-9: lustre-client is supposedly an old log, but no old LOV or MDT was found. Consider updating the configuration with --writeconf.
    Sep 24 19:44:39 mymds kernel: LustreError: 149-c: Failed to find lustre-OST0003 in the old client log. Apparently it is not part of this filesystem, or the old log is wrong.
    Sep 24 19:44:39 mymds kernel: Use 'writeconf' on the MDT to force log regeneration.

I can't mount a Lustre client yet, so I assume this is directly related. Any suggestions are welcome.

Thanks again,
-Nathan
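The interconnect problem described in this message (the Lustre network interface not coming up after the kernel upgrade) could be caught before attempting a mount. A minimal sketch, assuming NIDs in the usual `address@network` form; the helper name `has_nid_on_net` and the network name `mx` are illustrative assumptions, not taken from the thread:

```shell
#!/bin/sh
# Succeed if any NID read from stdin belongs to the given LNET network.
# NIDs have the form address@network, e.g. 10.0.0.1@tcp0.
# has_nid_on_net is a hypothetical helper name, not a Lustre tool.
has_nid_on_net() {
    net=$1
    # Match the network name, optionally followed by a network number.
    grep -Eq "@${net}[0-9]*\$"
}

# Hypothetical usage on a server; "mx" is an assumed network name:
#   lctl list_nids | has_nid_on_net mx || echo "no NID on mx"
```

If no NID is reported on the expected network, the interface or its driver is the first thing to check, since the servers cannot talk to each other over a network that LNET has not brought up.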
Wojciech Turek
2008-Sep-25 12:13 UTC
[Lustre-discuss] Lustre 1.6.4.3 install/upgrade problem
Hi,

I think what this message means is that you should run

    tunefs.lustre --writeconf /dev/<block_device>

on every disk (on the OSSs and MDSs). Before you do that, stop Lustre by unmounting all Lustre devices.

This command resets the Lustre configuration files to the state they have in a newly created file system. It will not delete or corrupt any data on your file system; it only touches Lustre configuration and log files. When you start your Lustre devices again, they will obtain a new configuration from the MGS and recreate their configuration and log files, and the messages you see should go away.

I hope this helps.

Regards,
Wojciech
--
Wojciech Turek
Assistant System Manager
High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517
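The writeconf step suggested above can be sketched as a dry run that only prints the commands, so they can be reviewed before anything is changed. The helper name `print_writeconf_cmds` and the device paths are hypothetical, and listing the MDT before the OSTs is an assumed ordering that the thread itself does not specify:

```shell
#!/bin/sh
# Print (do not run) the writeconf command for each device given.
# Run the printed commands only after every Lustre target and client
# has been unmounted, each on the node that owns the device.
# print_writeconf_cmds is a hypothetical helper name.
print_writeconf_cmds() {
    for dev in "$@"; do
        echo "tunefs.lustre --writeconf $dev"
    done
}

# Hypothetical device names; MDT listed first, then the OSTs (assumed order).
print_writeconf_cmds /dev/mdt_dev /dev/ost_dev1
```

Printing rather than executing keeps the sketch safe to try on any machine; on a real cluster each printed line would be run as root on the server that holds that device, followed by remounting the targets.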