thr3ads.net - Lustre discuss - [Lustre-discuss] OSS not healty [Mar 2008]


Frank Mietke

2008-Mar-13 09:15 UTC

[Lustre-discuss] OSS not healty

Hi,

we're using Lustre-1.6.4.2 and now one of our OSS (comprising two OSTs) shows
the status "not healthy".

dmesg tells the following:
...
[3082673.456429] LustreError: 16561:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30

I've found that it seems to be the error EROFS. The documentation states that I
have to restart Lustre services. Is it enough to umount / mount both OSTs on
this OSS or do I have to umount everything (MDS/OSS)? Anything else to care
about?
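For reference, the rc values in these kernel messages are negated Linux errno numbers, so they can be decoded with Python's standard library. This is only a small sketch; the helper name decode_rc is made up for illustration, and the printed description text depends on the platform's C library.

```python
import errno
import os


def decode_rc(rc):
    """Map a kernel-style return code (a negated errno) to (name, description).

    decode_rc is a hypothetical helper, not part of Lustre.
    """
    code = abs(rc)
    name = errno.errorcode.get(code, "UNKNOWN({})".format(code))
    return name, os.strerror(code)


if __name__ == "__main__":
    # -30 is the code from the dmesg line above; -5 and -2 are other
    # codes that commonly show up in Lustre error messages.
    for rc in (-30, -5, -2):
        name, text = decode_rc(rc)
        print("rc = {}: {} ({})".format(rc, name, text))
```

On Linux, -30 decodes to EROFS, which matches the read-only filesystem reading of the message.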

Best Regards,
Frank




-- 
Dipl.-Inf. Frank Mietke     |     Fakultätsrechen- und Informationszentrum
Tel.: 0371 - 531 - 35538    |     Fak. für Informatik
Fax:  0371 - 531 8 35538    |     TU-Chemnitz
Key-ID: 60F59599            |     frank.mietke at informatik.tu-chemnitz.de

Brian J. Murrell

2008-Mar-13 09:27 UTC

On Thu, 2008-03-13 at 10:15 +0100, Frank Mietke wrote:
> I've found that it seems to be the error EROFS. The documentation states that I
> have to restart Lustre services. Is it enough to umount / mount both OSTs on
> this OSS or do I have to umount everything (MDS/OSS)? Anything else to care
> about?

A remount of the read-only OSTs should be enough, but you might want to
investigate why they went RO in the first place.

b.

Andreas Dilger

2008-Mar-13 09:29 UTC

On Mar 13, 2008  10:15 +0100, Frank Mietke wrote:
> we're using Lustre-1.6.4.2 and now one of our OSS (comprising two OSTs) shows
> the status "not healthy".
>
> dmesg tells the following:
> ...
> [3082673.456429] LustreError: 16561:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
>
> I've found that it seems to be the error EROFS. The documentation states that I
> have to restart Lustre services. Is it enough to umount / mount both OSTs on
> this OSS or do I have to umount everything (MDS/OSS)? Anything else to care
> about?

You should investigate in your /var/log/messages why this happened.  It
is usually a sign of filesystem corruption or disk errors, so you would
likely also need to run e2fsck before remounting the filesystem.

Doing the unmount/mount of just the OSTs should be enough.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

Frank Mietke

2008-Mar-13 11:34 UTC

Hi,

On Thu, Mar 13, 2008 at 03:29:29AM -0600, Andreas Dilger wrote:
> On Mar 13, 2008  10:15 +0100, Frank Mietke wrote:
> > we're using Lustre-1.6.4.2 and now one of our OSS (comprising two OSTs) shows
> > the status "not healthy".
> >
> > dmesg tells the following:
> > ...
> > [3082673.456429] LustreError: 16561:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
> >
> > I've found that it seems to be the error EROFS. The documentation states that I
> > have to restart Lustre services. Is it enough to umount / mount both OSTs on
> > this OSS or do I have to umount everything (MDS/OSS)? Anything else to care
> > about?
>
> You should investigate in your /var/log/messages why this happened.  It
> is usually a sign of filesystem corruption or disk errors, so you would
> likely also need to run e2fsck before remounting the filesystem.

Okay, I've found the following in /var/log/messages, just before the bulk of
the messages above. It seems that something went wrong with the RAID. Any hints?

Mar 13 05:50:37 chic2e24 kernel: [3067020.190468] LustreError: 4574:0:(ldlm_resource.c:719:ldlm_resource_add()) lvbo_init failed for resource 116733: rc -2
Mar 13 05:50:37 chic2e24 kernel: [3067020.190907] LustreError: 4574:0:(ldlm_resource.c:719:ldlm_resource_add()) Skipped 1 previous similar message
Mar 13 05:50:57 chic2e24 kernel: [3067040.964208] LustreError: 4598:0:(ldlm_resource.c:719:ldlm_resource_add()) lvbo_init failed for resource 10518: rc -2
Mar 13 05:50:57 chic2e24 kernel: [3067040.964652] LustreError: 4598:0:(ldlm_resource.c:719:ldlm_resource_add()) Skipped 2 previous similar messages
Mar 13 06:17:31 chic2e24 kernel: [3068633.701448] attempt to access beyond end of device
Mar 13 06:17:31 chic2e24 kernel: [3068633.701454] sda: rw=1, want=11287722456, limit=7796867072
Mar 13 06:17:31 chic2e24 kernel: [3068633.701555] attempt to access beyond end of device
Mar 13 06:17:31 chic2e24 kernel: [3068633.701558] sda: rw=1, want=25366292592, limit=7796867072
Mar 13 06:17:31 chic2e24 kernel: [3068633.701562] Buffer I/O error on device sda, logical block 3170786573
Mar 13 06:17:31 chic2e24 kernel: [3068633.701785] lost page write due to I/O error on sda
Mar 13 06:17:31 chic2e24 kernel: [3068633.702004] Aborting journal on device sda.
Mar 13 06:17:31 chic2e24 kernel: [3068633.702226] LustreError: 4493:0:(obd.h:1038:obd_transno_commit_cb()) chicfs-OST0010: transno 6510615555435490347 commit error: 2
Mar 13 06:17:31 chic2e24 kernel: [3068633.702933] LDISKFS-fs error (device sda) in ldiskfs_reserve_inode_write: Journal has aborted
Mar 13 06:17:31 chic2e24 kernel: [3068633.703587] Remounting filesystem read-only
Mar 13 06:17:31 chic2e24 kernel: [3068633.704001] journal commit I/O error
Mar 13 06:17:31 chic2e24 kernel: [3068633.704981] LDISKFS-fs error (device sda) in ldiskfs_dirty_inode: Journal has aborted
Mar 13 06:17:31 chic2e24 kernel: [3068633.705034] LustreError: 5887:0:(filter_io_26.c:767:filter_commitrw_write()) Failure to commit OST transaction (-5)?
Mar 13 06:17:31 chic2e24 kernel: [3068633.706134] LustreError: 4662:0:(fsfilt-ldiskfs.c:1318:fsfilt_ldiskfs_write_record()) can't start transaction for 37 blocks (128 bytes)
Mar 13 06:17:31 chic2e24 kernel: [3068633.706718] LustreError: 4662:0:(filter.c:139:filter_finish_transno()) wrote trans 6510615555435490348 for client 67e1aea3-f93a-affd-b39d-eefa306ae345 at #212: err = -30
Mar 13 06:17:31 chic2e24 kernel: [3068633.707570] LustreError: 4662:0:(filter_io_26.c:566:filter_direct_io()) can't close transaction: -30
Mar 13 06:17:31 chic2e24 kernel: [3068633.708153] LustreError: 4662:0:(fsfilt-ldiskfs.c:483:fsfilt_ldiskfs_commit_async()) error while stopping transaction: -30
Mar 13 06:17:31 chic2e24 kernel: [3068633.708735] LustreError: 4662:0:(filter_io_26.c:767:filter_commitrw_write()) Failure to commit OST transaction (-5)?
Mar 13 06:17:31 chic2e24 kernel: [3068633.708875] LustreError: 16324:0:(fsfilt-ldiskfs.c:417:fsfilt_ldiskfs_brw_start()) can't get handle for 530 credits: rc = -30
Mar 13 06:17:31 chic2e24 kernel: [3068633.708881] LustreError: 16324:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
Mar 13 06:17:31 chic2e24 kernel: [3068633.708976] LustreError: 4776:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
Mar 13 06:17:31 chic2e24 kernel: [3068633.709006] LustreError: 4742:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30
Mar 13 06:17:31 chic2e24 kernel: [3068633.711072] LustreError: 4493:0:(obd.h:1038:obd_transno_commit_cb()) chicfs-OST0010: transno 6510615555435490348 commit error: 2
Mar 13 06:17:31 chic2e24 kernel: [3068633.711100] LustreError: 16385:0:(fsfilt-ldiskfs.c:417:fsfilt_ldiskfs_brw_start()) can't get handle for 530 credits: rc = -30
Mar 13 06:17:31 chic2e24 kernel: [3068633.711105] LustreError: 16385:0:(fsfilt-ldiskfs.c:417:fsfilt_ldiskfs_brw_start()) Skipped 2 previous similar messages
Mar 13 06:17:31 chic2e24 kernel: [3068633.711110] LustreError: 16385:0:(filter_io_26.c:705:filter_commitrw_write()) error starting transaction: rc = -30

Best Regards,
Frank

> 
> Doing the unmount/mount of just the OSTs should be enough
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> 
> 
-- 
Dipl.-Inf. Frank Mietke     |     Fakultätsrechen- und Informationszentrum
Tel.: 0371 - 531 - 35538    |     Fak. für Informatik
Fax:  0371 - 531 8 35538    |     TU-Chemnitz
Key-ID: 60F59599            |     frank.mietke at informatik.tu-chemnitz.de

Brian J. Murrell

2008-Mar-13 12:44 UTC

On Thu, 2008-03-13 at 12:34 +0100, Frank Mietke wrote:
> okay I've found the following in /var/log/messages before the bulk of above
> messages come. It seems that something with the RAID went wrong.

I don't see anything RAID specific however...

> Mar 13 06:17:31 chic2e24 kernel: [3068633.701448] attempt to access beyond end of device
> Mar 13 06:17:31 chic2e24 kernel: [3068633.701454] sda: rw=1, want=11287722456, limit=7796867072

This is pretty self-explanatory.  Something tried to read beyond the end
of the disk.  Something has a misunderstanding of how big the disk is.
Is it possible that the disk format process was misled about the disk
size during initialization?
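To put numbers on the mismatch, the want/limit pair can be converted to bytes. The sketch below assumes the usual block-layer convention that both values are counted in 512-byte sectors; the function name parse_beyond_end and the dictionary keys are made up for illustration.

```python
import re

SECTOR_BYTES = 512  # assumption: want/limit are in 512-byte sectors

# Matches the "sda: rw=1, want=..., limit=..." part of the kernel message.
LINE_RE = re.compile(
    r"(?P<dev>\w+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)


def parse_beyond_end(line):
    """Return device name and byte offsets for a 'beyond end of device' line.

    Returns None if the line does not contain a want/limit pair.
    """
    m = LINE_RE.search(line)
    if m is None:
        return None
    want = int(m.group("want"))
    limit = int(m.group("limit"))
    return {
        "device": m.group("dev"),
        "want_bytes": want * SECTOR_BYTES,
        "limit_bytes": limit * SECTOR_BYTES,
        "overshoot_bytes": (want - limit) * SECTOR_BYTES,
    }


if __name__ == "__main__":
    line = ("Mar 13 06:17:31 chic2e24 kernel: [3068633.701454] "
            "sda: rw=1, want=11287722456, limit=7796867072")
    info = parse_beyond_end(line)
    tib = 2 ** 40
    print("device size : {:.2f} TiB".format(info["limit_bytes"] / tib))
    print("request end : {:.2f} TiB".format(info["want_bytes"] / tib))
    print("overshoot   : {:.2f} TiB".format(info["overshoot_bytes"] / tib))
```

A request that ends far beyond the device, rather than just past its end, hints more at a bogus block number than at an off-by-a-little size mismatch.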

Andreas, does mkfs do any bounds checking to verify the sanity of the
mkfs request?  I.e. does it make sure that if/when you specify a number
of blocks for a filesystem, that many blocks are actually available?

Frank, is it at all possible that the size of the device had somehow
gotten smaller since you first initialized it?

> Mar 13 06:17:31 chic2e24 kernel: [3068633.701555] attempt to access beyond end of device
> Mar 13 06:17:31 chic2e24 kernel: [3068633.701558] sda: rw=1, want=25366292592, limit=7796867072
> Mar 13 06:17:31 chic2e24 kernel: [3068633.701562] Buffer I/O error on device sda, logical block 3170786573
> Mar 13 06:17:31 chic2e24 kernel: [3068633.701785] lost page write due to I/O error on sda
> Mar 13 06:17:31 chic2e24 kernel: [3068633.702004] Aborting journal on device sda.

This is all just fallout error messages from the attempted read beyond
EOF.

> Mar 13 06:17:31 chic2e24 kernel: [3068633.702226] LustreError: 4493:0:(obd.h:1038:obd_transno_commit_cb()) chicfs-OST0010: transno 6510615555435490347 commit error: 2
> Mar 13 06:17:31 chic2e24 kernel: [3068633.702933] LDISKFS-fs error (device sda) in ldiskfs_reserve_inode_write: Journal has aborted
> Mar 13 06:17:31 chic2e24 kernel: [3068633.703587] Remounting filesystem read-only
> Mar 13 06:17:31 chic2e24 kernel: [3068633.704001] journal commit I/O error
> Mar 13 06:17:31 chic2e24 kernel: [3068633.704981] LDISKFS-fs error (device sda) in ldiskfs_dirty_inode: Journal has aborted

And this is the ldiskfs fallout.

b.

Frank Mietke

2008-Mar-13 13:55 UTC

Brian,

On Thu, Mar 13, 2008 at 01:44:45PM +0100, Brian J. Murrell wrote:
> On Thu, 2008-03-13 at 12:34 +0100, Frank Mietke wrote:
>
> > okay I've found the following in /var/log/messages before the bulk of above
> > messages come. It seems that something with the RAID went wrong.
>
> I don't see anything RAID specific however...

You're right, my mistake.

> > Mar 13 06:17:31 chic2e24 kernel: [3068633.701448] attempt to access beyond end of device
> > Mar 13 06:17:31 chic2e24 kernel: [3068633.701454] sda: rw=1, want=11287722456, limit=7796867072
>
> This is pretty self-explanatory.  Something tried to read beyond the end
> of the disk.  Something has a misunderstanding of how big the disk is.

That's why I'm asking.

> Is it possible that the disk format process was misled about the disk
> size during initialization?
>
> Andreas, does mkfs do any bounds checking to verify the sanity of the
> mkfs request?  I.e. does it make sure that if/when you specify a number
> of blocks for a filesystem that that many block are available?
>
> Frank, is it at all possible that the size of the device had somehow
> gotten smaller since you first initialized it?

I don't think so, because all the other OSTs show the same size. Is there a way
to ask the MGS/MDS what disk size it assumes?

Frank




-- 
Dipl.-Inf. Frank Mietke     |     Fakultätsrechen- und Informationszentrum
Tel.: 0371 - 531 - 35538    |     Fak. für Informatik
Fax:  0371 - 531 8 35538    |     TU-Chemnitz
Key-ID: 60F59599            |     frank.mietke at informatik.tu-chemnitz.de

Brian J. Murrell

2008-Mar-13 14:01 UTC

On Thu, 2008-03-13 at 14:55 +0100, Frank Mietke wrote:
> I think, no, because all the other OSTs show the same size. Is there a way to
> request the assumptions of disk size from the MGS/MDS?

The MGS/MDS just uses an underlying (enhanced) ext3 filesystem we call
ldiskfs.  If you install the latest version of our e2fsprogs you can use
debugfs' "stat" command to get the various parameters of the filesystem.
You can use the block count and block size to calculate the size that
ext3/ldiskfs thinks it is and compare that to the size
that /proc/partitions thinks it is.
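The comparison described above can be sketched roughly as follows. It assumes you have already read the block count and block size from the superblock (for example with the e2fsprogs tools) and relies on /proc/partitions reporting sizes in 1 KiB blocks. The function names and the sample numbers are illustrative only; the sample partition size is simply the "limit" value from the kernel log converted to KiB, and the block count is a made-up value that exactly fills it.

```python
def filesystem_bytes(block_count, block_size):
    """Size the filesystem believes it has, from superblock values."""
    return block_count * block_size


def partition_bytes(proc_partitions_text, device):
    """Size of `device` according to /proc/partitions text, or None.

    /proc/partitions has the columns: major minor #blocks name,
    where #blocks is counted in 1 KiB units.
    """
    for line in proc_partitions_text.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[3] == device and fields[2].isdigit():
            return int(fields[2]) * 1024
    return None


def filesystem_fits(block_count, block_size, proc_partitions_text, device):
    """True if the filesystem is no larger than the underlying device."""
    dev_bytes = partition_bytes(proc_partitions_text, device)
    if dev_bytes is None:
        raise ValueError("device not found: " + device)
    return filesystem_bytes(block_count, block_size) <= dev_bytes


if __name__ == "__main__":
    # Illustrative sample only; on a real system read /proc/partitions.
    sample = "major minor  #blocks  name\n\n   8        0 3898433536 sda\n"
    # Hypothetical superblock values: 974608384 blocks of 4096 bytes.
    print(filesystem_fits(974608384, 4096, sample, "sda"))
```

If the filesystem turns out to be larger than the device, the device shrank or was mis-sized at format time; if it fits, corruption inside the filesystem is the more likely cause.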

b.

Andreas Dilger

2008-Mar-13 18:11 UTC

On Mar 13, 2008  13:44 +0100, Brian J. Murrell wrote:
> On Thu, 2008-03-13 at 12:34 +0100, Frank Mietke wrote:
> > Mar 13 06:17:31 chic2e24 kernel: [3068633.701448] attempt to access beyond end of device
> > Mar 13 06:17:31 chic2e24 kernel: [3068633.701454] sda: rw=1, want=11287722456, limit=7796867072
>
> This is pretty self-explanatory.  Something tried to read beyond the end
> of the disk.  Something has a misunderstanding of how big the disk is.
> Is it possible that the disk format process was misled about the disk
> size during initialization?

Unlikely.

> Andreas, does mkfs do any bounds checking to verify the sanity of the
> mkfs request?  I.e. does it make sure that if/when you specify a number
> of blocks for a filesystem that that many block are available?

Yes, mke2fs will zero out the last ~128kB of the device to overwrite any
MD RAID signatures, and also verify that the device is as big as requested.

These kinds of errors are usually a result of corruption internal to the
filesystem, where some garbage is interpreted as a block number beyond the
end of the device.
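The numbers in the log quoted in this thread fit that explanation. As a rough sketch, assuming 4 KiB filesystem blocks and 512-byte sectors (both assumptions, although the figures are consistent with them), the failing logical block 3170786573 ends exactly at sector 25366292592, the "want" value of the second message, and lies far past the last block the device can hold. The function names are made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: kernel want/limit values are in 512-byte sectors


def block_to_end_sector(block, block_size=4096):
    """Sector just past the end of filesystem block `block`.

    block_size=4096 is an assumed ldiskfs block size.
    """
    sectors_per_block = block_size // SECTOR_BYTES
    return (block + 1) * sectors_per_block


def is_block_in_range(block, limit_sectors, block_size=4096):
    """True if the whole block lies within a device of `limit_sectors`."""
    return block_to_end_sector(block, block_size) <= limit_sectors


if __name__ == "__main__":
    limit = 7796867072   # "limit" from the kernel log
    bad_block = 3170786573  # "logical block" from the Buffer I/O error
    print("end sector of bad block:", block_to_end_sector(bad_block))
    print("in range:", is_block_in_range(bad_block, limit))
    print("blocks the device can hold:", limit // (4096 // SECTOR_BYTES))
```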
> > Mar 13 06:17:31 chic2e24 kernel: [3068633.701555] attempt to access beyond end of device
> > Mar 13 06:17:31 chic2e24 kernel: [3068633.701558] sda: rw=1, want=25366292592, limit=7796867072
> > Mar 13 06:17:31 chic2e24 kernel: [3068633.701562] Buffer I/O error on device sda, logical block 3170786573
> > Mar 13 06:17:31 chic2e24 kernel: [3068633.701785] lost page write due to I/O error on sda
> > Mar 13 06:17:31 chic2e24 kernel: [3068633.702004] Aborting journal on device sda.
>
> This is all just fallout error messages from the attempted read beyond
> EOF.

Time to unmount the filesystem and run a full e2fsck: "e2fsck -fp /dev/sdaNNN"
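When scripting such a check, note that e2fsck reports its result as a bitmask in the exit status, as documented in its man page. The sketch below decodes that status; the function names are made up for illustration, the device path is left to the caller, and run_e2fsck should only be used on an unmounted filesystem.

```python
import subprocess

# Exit status bits as listed in the e2fsck man page.
E2FSCK_FLAGS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def describe_e2fsck_exit(code):
    """Translate an e2fsck exit status into human-readable strings."""
    if code == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_FLAGS.items() if code & bit]


def needs_manual_fsck(code):
    """True if errors were left uncorrected (bit 4).

    With -p this is what e2fsck signals when a problem needs an
    administrator's decision.
    """
    return bool(code & 4)


def run_e2fsck(device):
    """Run 'e2fsck -fp' on an UNMOUNTED device and return its exit status."""
    return subprocess.run(["e2fsck", "-fp", device]).returncode


if __name__ == "__main__":
    for status in (0, 1, 4, 12):
        print(status, describe_e2fsck_exit(status))
```

If needs_manual_fsck returns True after a -p run, the usual next step is an interactive e2fsck run without -p.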

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

Frank Mietke

2008-Mar-14 16:15 UTC

Hi,
> > > Mar 13 06:17:31 chic2e24 kernel: [3068633.701562] Buffer I/O error on device sda, logical block 3170786573
> > > Mar 13 06:17:31 chic2e24 kernel: [3068633.701785] lost page write due to I/O error on sda
> > > Mar 13 06:17:31 chic2e24 kernel: [3068633.702004] Aborting journal on device sda.
> >
> > This is all just fallout error messages from the attempted read beyond
> > EOF.
>
> Time to unmount the filesystem and run a full e2fsck "e2fsck -fp /dev/sdaNNN"

I did an e2fsck run. It recovered the ext3 journal and found a handful
of bad blocks which were corrected.

Thanks,
Frank

> 
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> 
-- 
Dipl.-Inf. Frank Mietke     |     Fakultätsrechen- und Informationszentrum
Tel.: 0371 - 531 - 35538    |     Fak. für Informatik
Fax:  0371 - 531 8 35538    |     TU-Chemnitz
Key-ID: 60F59599            |     frank.mietke at informatik.tu-chemnitz.de
