Hi;

We have a Lustre filesystem that had been quite stable since June 2008 on a 200-node cluster, until three weeks ago. Since then the OSS kernel panics have escalated and now occur about every 2 hours.

The MDT/MGS is on an x86_64 server with 8G memory and 2 dual-core AMD procs.
The OSS is on an x86_64 server with 8G memory and 2 dual-core AMD procs.
One OST, RAID 6, ~9TB (I know it is larger than currently tested), at 58%.
Lustre 1.6.4.2

I decreased the threads to 256 and then to 128, thinking the storage was oversubscribed, but the kernel panics continue. The storage has no errors in its logs, and I have done an fsck with no filesystem issues detected. We do have an average of ~35 Gaussian programs running, which is heavy I/O, but collectl does not show any system stress before the panic. The console shows a few messages about brw_writes and OST timeouts.

I am attaching the syslog messages from just before one of the kernel panics and the one Lustre dump that has data.

If anyone has any thoughts, I would appreciate it.

Denise

[Attachments scrubbed by the list archive:
 messages (161948 bytes) - http://lists.lustre.org/pipermail/lustre-discuss/attachments/20081203/d1dbe851/attachment-0002.obj
 lustre-log.1228334804.4054 (904990 bytes) - http://lists.lustre.org/pipermail/lustre-discuss/attachments/20081203/d1dbe851/attachment-0003.obj]
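P.S. For reference, the way I lowered the OST thread count was via a module option on the OSS followed by a remount of the OST. The snippet below is approximate -- the parameter name is the one given in the 1.6 operations manual, and 128 is simply the last value I tried:

    # /etc/modprobe.conf on the OSS (Lustre 1.6.x)
    # Cap the number of OSS service threads; the new value takes effect
    # the next time the OST is mounted.
    options ost oss_num_threads=128

The right number presumably depends on how many concurrent I/Os the RAID 6 back end can actually absorb, which is what I am trying to find out.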
On Wed, 2008-12-03 at 19:30 -0700, Hummel, Denise wrote:
> Hi;

I've only had a chance to take the quickest of peeks at this...

> We have a lustre filesystem that has been pretty stable since June 2008 on a 200 node
> cluster until three weeks ago. The OSS kernel panic has escalated since then to now about
> every 2 hours.

Those are not "panics". A kernel panic is a very particular thing, and what you are seeing is not that. What you are seeing is watchdog timers firing. I notice that they mostly (all?) seem to be in ldiskfs code paths, and at the end of the messages there are a bunch of these:

Dec 3 13:07:36 oss1 kernel: Lustre: 3990:0:(lustre_fsfilt.h:240:fsfilt_brw_start_log()) lustre-OST0000: slow journal start 112s
Dec 3 13:07:36 oss1 kernel: Lustre: 3942:0:(filter_io_26.c:711:filter_commitrw_write()) lustre-OST0000: slow brw_start 36s
Dec 3 13:07:36 oss1 kernel: Lustre: 3947:0:(lustre_fsfilt.h:205:fsfilt_start_log()) lustre-OST0000: slow journal start 128s
Dec 3 13:07:36 oss1 kernel: Lustre: 3947:0:(watchdog.c:312:lcw_update_time()) Expired watchdog for pid 3947 disabled after 128.2092s
Dec 3 13:07:36 oss1 kernel: Lustre: 3988:0:(lustre_fsfilt.h:296:fsfilt_commit_wait()) lustre-OST0000: slow journal start 150s
Dec 3 13:07:36 oss1 kernel: Lustre: 3988:0:(filter_io_26.c:776:filter_commitrw_write()) lustre-OST0000: slow commitrw commit 150s
Dec 3 13:07:36 oss1 kernel: Lustre: 4035:0:(filter_io_26.c:763:filter_commitrw_write()) lustre-OST0000: slow direct_io 31s
Dec 3 13:07:36 oss1 kernel: Lustre: 4053:0:(filter_io_26.c:698:filter_commitrw_write()) lustre-OST0000: slow i_mutex 150s
Dec 3 13:07:36 oss1 kernel: Lustre: 4000:0:(filter_io_26.c:763:filter_commitrw_write()) lustre-OST0000: slow direct_io 132s
Dec 3 13:07:36 oss1 kernel: Lustre: 4000:0:(filter_io_26.c:763:filter_commitrw_write()) Skipped 10 previous similar messages
Dec 3 13:07:36 oss1 kernel: Lustre: 4054:0:(filter_io_26.c:776:filter_commitrw_write()) lustre-OST0000: slow commitrw commit 151s

which means your storage is too slow for the load that the OSS is putting on it.

> I decreased the threads to 256 then 128 thinking the storage was oversubscribed

Your instinct was right: this is certainly a symptom of an oversubscribed back end, although it can indicate other problems as well.

> The storage has no errors in the logs.

Hrm. That was going to be my next question. This symptom can also describe a back-end storage system that has "slowed down", or a load that has gone up over time. Perhaps the storage has always been oversubscribed but was just never taxed hard enough for the symptom to show.

Did you ever do any iokit benchmarking of your storage before you put Lustre on it? I hope so, because that gives you a baseline: you can do another obdfilter-survey run now and compare the two to see how they measure up.

Even if you didn't do a baseline obdfilter-survey run before you started, doing one now will help you tune the number of OST threads you can use before you enter the realm of diminishing returns. The alternative, of course, is to keep binary-searching for your "sweet spot". If you choose the latter, once you have found the number of OST threads you can run with before hitting too many "slow" messages and watchdog timeouts, you can do some benchmarking to see whether your performance is what you would expect given your storage interconnect and hardware. If not, you will need to start trying to figure out why.

b.
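P.S. In case it saves you some digging, a disk-case obdfilter-survey run against your single OST looks roughly like the one below. The thread and object ranges are only an example, and the variable names should be double-checked against the README in the iokit version you actually have:

    # Run on the OSS against the obdfilter instance directly; this exercises
    # the OST back end and journal with no client or network involvement,
    # which is exactly the path the "slow journal start" messages point at.
    # Best done while the filesystem is otherwise quiet.
    size=1024 nobjlo=1 nobjhi=32 thrlo=16 thrhi=256 \
        targets="lustre-OST0000" ./obdfilter-survey

Watching where the write bandwidth flattens out (or where the "slow" messages reappear) as the thread count climbs gives you a defensible OST thread setting rather than a guess.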
On Dec 03, 2008 19:30 -0700, Hummel, Denise wrote:
> We have a lustre filesystem that has been pretty stable since June 2008 on
> a 200 node cluster until three weeks ago. The OSS kernel panic has
> escalated since then to now about every 2 hours.
> The MDT/MGS is on a x86_64 server with 8G memory and 2 dual core AMD procs
> The OSS is on a x86_64 server with 8G memory and 2 dual core AMD procs
> One OST raid 6 ~9TB (I know it is larger than currently tested) - at 58%

Running with OSTs > 8TB exposes you to filesystem corruption.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Thursday 04 December 2008, Andreas Dilger wrote:
> On Dec 03, 2008 19:30 -0700, Hummel, Denise wrote:
> > We have a lustre filesystem that has been pretty stable since June 2008
> > on a 200 node cluster until three weeks ago. The OSS kernel panic has
> > escalated since then to now about every 2 hours.
> > The MDT/MGS is on a x86_64 server with 8G memory and 2 dual core AMD
> > procs The OSS is on a x86_64 server with 8G memory and 2 dual core AMD
> > procs One OST raid 6 ~9TB (I know it is larger than currently tested) -
> > at 58%
>
> Running with OSTs > 8TB exposes you to filesystem corruption.
>
> Cheers, Andreas

Wouldn't it be an idea then to warn/refuse during mkfs.lustre?

/Peter
On Dec 08, 2008 20:40 +0100, Peter Kjellstrom wrote:
> On Thursday 04 December 2008, Andreas Dilger wrote:
> > Running with OSTs > 8TB exposes you to filesystem corruption.
>
> Wouldn't it be an idea then to warn/refuse during mkfs.lustre?

Yes, this should be added. In the past e2fsprogs itself would refuse to create > 8TB filesystems, but that limit was removed from the upstream e2fsprogs for ext4, and we didn't add a corresponding restriction to mkfs.lustre when it went away.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
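P.S. Until mkfs.lustre grows such a check, a manual guard in front of the format step can catch the case. This is only a sketch: the device path and MGS NID are placeholders, and it treats the 8TB limit as 8 TiB:

    #!/bin/sh
    # Refuse to format an OST device larger than the tested 8TB limit.
    DEV=/dev/sdb1                                   # placeholder OST device
    LIMIT=$((8 * 1024 * 1024 * 1024 * 1024))        # 8 TiB in bytes
    SIZE=$(blockdev --getsize64 "$DEV")             # device size in bytes
    if [ "$SIZE" -gt "$LIMIT" ]; then
        echo "$DEV is $SIZE bytes, over the tested 8TB limit; not formatting" >&2
        exit 1
    fi
    mkfs.lustre --fsname=lustre --ost --mgsnode=192.168.0.10@tcp0 "$DEV"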