thr3ads.net - Lustre discuss - [Lustre-discuss] Serious problem with OSTs [Dec 2010]

If this information is useful, please help other people find it:
Share via:

Mervini, Joseph A

2010-Dec-30 03:22 UTC

[Lustre-discuss] Serious problem with OSTs

Hi,

Over the past weekend we encountered several problems with hardware on one of
our scratch file systems. One fallout from the failures was that one of our
clusters was no longer able to access the file system through the routers. On
closer examination I encountered IO errors when using lctl ping to one of the
OSS servers. Looking at the OSS dmesg showed problem with routers being
unavailable (they were restarted earlier in this troubleshooting exercise). I
shutdown lustre but encountered problems on when trying to restart it. Three of
the OSTs would not mount. I rebooted the system and encountered the same
problems.

So when I tried to mount the OST I get the following:

[root at rio37 ~]# mount -t lustre /dev/sdf /mnt
mount.lustre: mount /dev/sdf at /mnt failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)

And examining the LUN with tunefs.lustre produces the following:

[root at rio37 ~]# tunefs.lustre /dev/sdf
checking for existing Lustre data: found last_rcvd
tunefs.lustre: Unable to read 1.6 config /tmp/dirUvdBcz/mountdata.
Trying 1.4 config from last_rcvd
Reading last_rcvd
Feature compat=2, incompat=0

   Read previous values:
Target:     
Index:      54
UUID:       ostr)o37sdf_UID
Lustre FS:  lustre
Mount type: ldiskfs
Flags:      0x202
              (OST upgrade1.4 )
Persistent mount opts: 
Parameters:


tunefs.lustre FATAL: Must specify --mgsnodetunefs.lustre: exiting with 22
(Invalid argument)

When compared with a valid target on the same node it is obvious that it is
screwed up:

[root at rio37 ~]# tunefs.lustre /dev/sdd
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata

   Read previous values:
Target:     scratch1-OST0074
Index:      116
UUID:       ostrio37sdd_UUID
Lustre FS:  scratch1
Mount type: ldiskfs
Flags:      0x2
              (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.196.135.17 at o2ib1


   Permanent disk data:
Target:     scratch1-OST0074
Index:      116
UUID:       ostrio37sdd_UUID
Lustre FS:  scratch1
Mount type: ldiskfs
Flags:      0x2
              (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.196.135.17 at o2ib1

Writing CONFIGS/mountdata

I suspected that there were file system inconsistencies so I ran fsck on one of
the target and got a large number of errors, primarily "Multiply-claimed
blocks" running e2fsck -fp and when it completed the OS told me I needed to
run fsck manually which I did with the "-fy" options. This dumped a
ton of inodes to lost+found. In addition, when it started it converted the file
system from ext3 to ext2 during the fsck and then recreated the journal when it
completed. However, I was still unable to mount the LUN and tunefs.lustre still
had the FATAL condition shown above.

I AM able to mount all of the LUNs as ldiskfs devices so I suspect that the
lustre config for those OSTs just got clobbered somehow. Also, looking at the
inodes that were dumped to lost+found, most of them have timestamps that are
more that a year old that by policy should have been purged so I''m
wondering if it is just an artifact of the file system not being checked for a
very long time.

Other things to note is the OSS is fiber channel attached to a DDN 9500 and the
OSTs that are having problems are associated with one controller of the couplet.
That is suspicious, but because neither controller is showing any faults I
suspect that whatever has occurred did not happen recently.  In addition, the 
/CONFIG/mountdata on all the targets originally had a timestamp of Aug 3 14:05
(and still does for the targets that can''t be mounted).

So I have two questions:

How can I restore the config data on the OSTs that are having problems?

What does "Multiply-claimed blocks" mean and does it indicate
corruption? I am afraid that running e2fsck may have compounded my problems and
am holding off on doing any file system checks on the other 2 target.

Thanks very much for your help in advance.

 
= 
Joe Mervini
Sandia National Laboratories
Dept 09326
PO Box 5800 MS-0823
Albuquerque NM 87185-0823

Andreas Dilger

2010-Dec-30 06:47 UTC

head link

[Lustre-discuss] Serious problem with OSTs

On 2010-12-29, at 20:22, "Mervini, Joseph A" <jamervi at
sandia.gov> wrote:> 
> And examining the LUN with tunefs.lustre produces the following:
> 
> [root at rio37 ~]# tunefs.lustre /dev/sdf
> checking for existing Lustre data: found last_rcvd
> tunefs.lustre: Unable to read 1.6 config /tmp/dirUvdBcz/mountdata.
That means the mountdata file is likely either missing or corrupted somehow. 
>   Read previous values:
> Target:     
> Index:      54
> UUID:       ostr)o37sdf_UID
> Lustre FS:  lustre
> Mount type: ldiskfs
> Flags:      0x202
>              (OST upgrade1.4 )
> Persistent mount opts: 
> Parameters:
> 
> I suspected that there were file system inconsistencies so I ran fsck on
one of the target and got a large number of errors, primarily
"Multiply-claimed blocks" running e2fsck -fp and when it completed the
OS told me I needed to run fsck manually which I did with the "-fy"
options. This dumped a ton of inodes to lost+found. In addition, when it started
it converted the file system from ext3 to ext2 during the fsck and then
recreated the journal when it completed.
There was some sort of device-level corruption in this case. The e2fsck fixed it
as much as possible, and you should run ll_recover_lost_found_objs on the
mounted filesystem.
> However, I was still unable to mount the LUN and tunefs.lustre still had
the FATAL condition shown above.
> 
> I AM able to mount all of the LUNs as ldiskfs devices so I suspect that the
lustre config for those OSTs just got clobbered somehow. Also, looking at the
inodes that were dumped to lost+found, most of them have timestamps that are
more that a year old that by policy should have been purged so I''m
wondering if it is just an artifact of the file system not being checked for a
very long time.
That depends in atime, which is normally only updated on the MDS on disk. 
> Other things to note is the OSS is fiber channel attached to a DDN 9500 and
the OSTs that are having problems are associated with one controller of the
couplet. That is suspicious, but because neither controller is showing any
faults I suspect that whatever has occurred did not happen recently.
It does seem to be the smoking gun.  
> In addition, the  /CONFIG/mountdata on all the targets originally had a
timestamp of Aug 3 14:05 (and still does for the targets that can''t be
mounted).
> 
> So I have two questions:
> 
> How can I restore the config data on the OSTs that are having problems?
I think there was a thread on rebuilding the mountdata file recently. 
> What does "Multiply-claimed blocks" mean and does it indicate
corruption?
Disk-level corruption. 
> I am afraid that running e2fsck may have compounded my problems and am
holding off on doing any file system checks on the other 2 target.
Well, it is needed at some point...

Lustre discuss - Dec 2010 - Serious problem with OSTs

[Lustre-discuss] Serious problem with OSTs

[Lustre-discuss] Serious problem with OSTs