Our OSS went crazy today. It is attached to two OSTs. The load is normally
around 2-4; right now it is 123. I noticed this to be the cause:

root      6748  0.0  0.0     0    0 ?  D  May27   8:57 [ll_ost_io_123]

All of them are stuck in uninterruptible sleep. Has anyone seen this happen
before? Is this caused by a pending disk failure? I ask about disk failure
because I also see this message:

mptscsi: ioc1: attempting task abort! (sc=0000010038904c40)
scsi1 : destination target 0, lun 0
        command = Read (10) 00 75 94 40 00 00 10 00 00
mptscsi: ioc1: task abort: SUCCESS (sc=0000010038904c40)

and:

Lustre: 6698:0:(lustre_fsfilt.h:306:fsfilt_setattr()) nobackup-OST0001: slow setattr 100s
Lustre: 6698:0:(watchdog.c:312:lcw_update_time()) Expired watchdog for pid 6698 disabled after 103.1261s

Thanks

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734) 936-1985
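[Editor's note: the Linux load average counts tasks in uninterruptible sleep (state D) as well as runnable tasks, which is why dozens of blocked ll_ost_io threads alone can push the load to 123. A minimal sketch of counting them, run here against an invented `ps -eo state,comm` snapshot rather than a live system:]

```shell
# Sample "ps -eo state,comm" output; the process names are invented.
sample='D ll_ost_io_123
D ll_ost_io_124
S sshd
R awk'

# Count tasks in uninterruptible sleep (state D). Each one contributes
# to the load average even though it consumes no CPU.
echo "$sample" | awk '$1 == "D" { n++ } END { print n+0, "tasks in D state" }'
```

On a real OSS, replacing the sample with live `ps -eo state,comm` output gives the number of threads currently blocked on I/O.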
On Fri, 2008-06-27 at 12:44 -0400, Brock Palen wrote:
> All of them are stuck in uninterruptible sleep. Has anyone seen this
> happen before? Is this caused by a pending disk failure?

Well, they are certainly stuck because of some blocking I/O. That could
be disk failure, indeed.

> mptscsi: ioc1: attempting task abort! (sc=0000010038904c40)
> scsi1 : destination target 0, lun 0
>         command = Read (10) 00 75 94 40 00 00 10 00 00
> mptscsi: ioc1: task abort: SUCCESS (sc=0000010038904c40)

That does not look like a picture of happiness, indeed, no. You have
SCSI commands aborting.

> Lustre: 6698:0:(lustre_fsfilt.h:306:fsfilt_setattr()) nobackup-OST0001: slow setattr 100s
> Lustre: 6698:0:(watchdog.c:312:lcw_update_time()) Expired watchdog for pid 6698 disabled after 103.1261s

Those are just fallout from the above disk situation.

b.
On Fri, Jun 27, 2008 at 01:07:32PM -0400, Brian J. Murrell wrote:
> > mptscsi: ioc1: attempting task abort! (sc=0000010038904c40)
> > scsi1 : destination target 0, lun 0
> >         command = Read (10) 00 75 94 40 00 00 10 00 00
> > mptscsi: ioc1: task abort: SUCCESS (sc=0000010038904c40)
>
> That does not look like a picture of happiness, indeed, no. You have
> SCSI commands aborting.

Well, these messages are not nice of course, since the mpt error handler
got activated, but in principle a SCSI device can recover from this.
Unfortunately, the verbosity level of SCSI makes it impossible to figure
out what the problem actually was. Since we suffered from severe SCSI
problems, I wrote quite a number of patches to improve the situation. We
now at least can understand where a problem came from and also have
slightly improved error handling. These are presently for 2.6.22 only,
but my plan is to send them upstream for 2.6.28.

> > Lustre: 6698:0:(lustre_fsfilt.h:306:fsfilt_setattr()) nobackup-OST0001: slow setattr 100s
> > Lustre: 6698:0:(watchdog.c:312:lcw_update_time()) Expired watchdog for pid 6698 disabled after 103.1261s
>
> Those are just fallout from the above disk situation.

Probably the device was offlined, and that should also have been printed
in the logs. Brock, can you check the device status
(cat /sys/block/sdX/device/state)?

Cheers,
Bernd
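[Editor's note: the per-device check Bernd suggests can be wrapped in a small loop over every `sd*` device. A sketch, with the sysfs base directory parameterized so it can be exercised against a mock tree; `check_scsi_states` is a hypothetical helper name, and the sysfs layout assumed is the standard `/sys/block/sdX/device/state`:]

```shell
# Report the SCSI midlayer state of every sd* block device under $base.
# Anything other than "running" (e.g. "offline") means the midlayer has
# given up on the device. Returns nonzero if any device is not running.
# On a live OSS: check_scsi_states /sys
check_scsi_states() {
    base="${1:-/sys}"
    bad=0
    for f in "$base"/block/sd*/device/state; do
        [ -e "$f" ] || continue            # no sd* devices: glob unmatched
        dev=$(basename "$(dirname "$(dirname "$f")")")
        state=$(cat "$f")
        printf '%s: %s\n' "$dev" "$state"
        [ "$state" = "running" ] || bad=1
    done
    return "$bad"
}
```

The nonzero return code makes the function easy to drop into a monitoring cron job.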
On Jun 27, 2008, at 1:39 PM, Bernd Schubert wrote:
> Probably the device was offlined and actually this also should have
> been printed in the logs. Brock, can you check the device status
> (cat /sys/block/sdX/device/state).

IO is still flowing from both OSTs on that OSS:

[root at nyx167 ~]# cat /sys/block/sd*/device/state
running
running

Sigh, it only needs to live till August when we install our x4500s.
I think it's safe to send a notice to users that they may want to copy
their data.
On Fri, Jun 27, 2008 at 01:44:13PM -0400, Brock Palen wrote:
> IO is still flowing from both OSTs on that OSS:
>
> [root at nyx167 ~]# cat /sys/block/sd*/device/state
> running
> running

So the device recovered. Is this parallel SCSI? If so, it now might run
at a lower SCSI speed level, but you should have got domain validation
messages about this (unless you are using a customized driver which has
DV disabled).

Cheers,
Bernd
On Jun 27, 2008, at 2:22 PM, Bernd Schubert wrote:
> So the device recovered. Is this parallel SCSI? If so, it now might run
> at a lower SCSI speed level, but you should have got domain validation
> messages about this (unless you are using a customized driver which has
> DV disabled).

It's Fibre Channel for the medium, direct connected (no loop or switch),
so I am not sure; the driver is the stock one with RHEL4.
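[Editor's note: for fibre channel attachments the SCSI FC transport class exposes per-HBA link state through sysfs, which is the FC-side analogue of the device-state check above. A sketch; `fc_link_report` is a hypothetical helper, the paths assume the standard `scsi_transport_fc` sysfs layout (`/sys/class/fc_host/hostN/{port_state,speed}`), and the base directory is parameterized for testing:]

```shell
# Print link state and negotiated speed for every FC HBA under $base.
# A port_state of "Online" means the port sees light; "Linkdown" or
# "Blocked" points at the fabric or cabling. On a live system:
# fc_link_report /sys
fc_link_report() {
    base="${1:-/sys}"
    for h in "$base"/class/fc_host/host*; do
        [ -e "$h/port_state" ] || continue   # no FC HBAs: glob unmatched
        printf '%s: state=%s speed=%s\n' "$(basename "$h")" \
            "$(cat "$h/port_state")" "$(cat "$h/speed")"
    done
}
```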
On Jun 27, 2008, at 1:07 PM, Brian J. Murrell wrote:
> That does not look like a picture of happiness, indeed, no. You have
> SCSI commands aborting.

While the array was reporting no problems, one of the disks was really
lagging the others. We have swapped it out. Thanks for the feedback,
everyone.
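[Editor's note: a disk can pass the array's own health checks and still drag the whole LUN down, as happened here. Per-device latency is what exposes it. A sketch that flags any device whose average wait time is more than 5x the best peer; the two columns are modeled on the device and `await` fields of `iostat -x`, and the numbers are invented:]

```shell
# Invented "device await(ms)" pairs; on a real host, feed in columns
# extracted from "iostat -x" instead.
sample='sda 4.1
sdb 3.8
sdc 212.5
sdd 4.4'

# Flag devices whose average wait dwarfs the fastest peer (> 5x min).
echo "$sample" | awk '
    { dev[NR] = $1; await[NR] = $2 + 0
      if (NR == 1 || await[NR] < min) min = await[NR] }
    END { for (i = 1; i <= NR; i++)
              if (await[i] > 5 * min) print dev[i], "lagging:", await[i] "ms" }'
```

With the sample data this prints only `sdc`, the kind of slow-but-not-failed disk described above.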
On Fri, Jun 27, 2008 at 02:29:24PM -0400, Brock Palen wrote:
> It's Fibre Channel for the medium, direct connected (no loop or
> switch), so I am not sure; the driver is the stock one with RHEL4.

Ok, quite different then. I only have very little experience with FC, so
no idea what's wrong with your system now.

Cheers,
Bernd