Ms. Megan Larko
2009-Jul-16 22:31 UTC
[Lustre-discuss] One Lustre Client lost One Lustre Disk--solved
Hi,

I fixed the problem of the one Lustre client not mounting one Lustre disk. Truthfully, the problem expanded slightly: when I rebooted another client, it also lost contact with this one particular Lustre disk. The error messages were exactly the same:

  [root@crew01 ~]# mount /crew2
  mount.lustre: mount ic-mds1@o2ib:/crew2 at /crew2 failed: Invalid argument
  This may have multiple causes.
  Is 'crew2' the correct filesystem name?
  Are the mount options correct?
  Check the syslog for more info.

So I thought something might have become a bit off with the disk set-up. I had recently upgraded the other MDT disk to a larger physical volume, which was done successfully following the instructions in the Lustre Manual. So I thought perhaps the MDT that I did not change merely needed to be "re-set".

On the MGS, I unmounted the MDT of the problem disk and ran the following command:

  tunefs.lustre --writeconf --mgs --mdt --fsname=crew2 /dev/{sd-whatever}

I then remounted the MDT (which is also the MGS) successfully.

On the OSS, I first unmounted the OST disks and then issued the following command for each and every OST:

  tunefs.lustre --writeconf --ost /dev/{sd-whatever}

I then mounted my OSTs again successfully.

On my clients, I issued the mount command for the /crew2 Lustre disk and it now succeeded. No more "Invalid argument" message. One client did give me a "Transport endpoint is not connected" message, so that client will require a reboot to remount this Lustre disk (unless anyone can tell me how to do the remount without a reboot of this client; see the postscript below for one untested idea).

So, I am guessing that when I did the upgrade in hardware disk size on the non-MGS Lustre disk a few weeks ago, the other Lustre disk, which functions as the MGS, was left in a state such that clients could not pick up that disk again if they were rebooted. Is this an accurate guess? If it is, then one may wish to add a note to the Lustre Manual (Ch. 15 in the 1.6.x versions, on restoring metadata to an MDT disk) that the MGS disk may require an update using "tunefs.lustre --writeconf" even if it was not the disk which was restored.

I may be wrong in my guess, but the above procedure did get my Lustre disk back onto my clients successfully.

Cheers!
megan
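P.S. On the client remount question: this is only a guess on my part and I have not tried it here, but a forced unmount followed by a normal remount may avoid the reboot (the plain "mount /crew2" assumes the /etc/fstab entry already used above):

  # untested: force the unmount of the stuck Lustre client mount
  umount -f /crew2
  # remount using the existing /etc/fstab entry for /crew2
  mount /crew2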
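P.P.S. Before remounting, a quick way to check that a client can actually reach the MGS over the o2ib fabric is an LNET ping, using the NID from the error message above (again, just a suggestion, not something I tried during this episode):

  # check LNET connectivity from the client to the MGS/MDS NID
  lctl ping ic-mds1@o2ib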