Hi all,
I'm still successful in bringing my OSSs to a standstill, if not
crashing them outright.
Having reduced the number of stress jobs writing to Lustre (stress -d 2
--hdd-noclean --hdd-bytes 5M) to four, and having reduced the number of
OSS threads (options ost oss_num_threads=256 in /etc/modprobe.d/lustre;
both settings are recapped after the list below), the OSSs no longer
freeze entirely. Instead, after ~15 hours:
- all stress jobs have terminated with Input/output error
- the MDT has marked the affected OSTs as Inactive
- the already open connections to the OSS remain active
- interactive collectl, "watch df" and top sessions are still working
- the number of ll_ost threads is 256 (the number of ll_ost_io threads is 257?)
- log file writing had obviously stopped after only 10 hours
- already open shells still accept commands like "ps", and I can kill
some processes
- a new ssh login doesn't work
- any access to disk, as with "ls", brings the system to a total freeze
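To recap the two settings from above in one place (the modprobe file and
the stress options are exactly as quoted; how I launch the four jobs is
only sketched here):

    # /etc/modprobe.d/lustre -- cap the number of OSS service threads
    options ost oss_num_threads=256

    # one of the four stress jobs writing to Lustre: two hdd workers,
    # 5 MB per worker, files left in place afterwards
    stress -d 2 --hdd-noclean --hdd-bytes 5M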
The process table shows six ll_ost_io threads, all using 38.9% CPU and
all running for 419:21m. All the rest are sleeping.
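If it helps with the diagnosis, next time it hangs I can try to capture
where those threads are stuck from one of the still-open shells, roughly
along these lines (assuming SysRq is enabled on the box; the dump file
name is just an example):

    # dump kernel stacks of all tasks to the kernel log
    echo t > /proc/sysrq-trigger

    # save the Lustre kernel debug log
    lctl dk /tmp/lustre-debug.log

    # show what the ll_ost_io threads are waiting on
    ps -eLo pid,stat,wchan:30,comm | grep ll_ost_io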
The cause can't be system overload or simply faulty hardware. To give
an impression of what is going on, I'm quoting the last collectl record:
##########################################################################################
### RECORD 139 (1217475195.342) (Thu Jul 31 05:33:15 2008) ###
# CPU SUMMARY (INTR, CTXSW & PROC /sec)
#USER NICE  SYS WAIT  IRQ SOFT STEAL IDLE  INTR CTXSW PROC RUNQ RUN  AVG1  AVG5 AVG15
    0    0   14   20    0    5     0   58  4255   53K    1  736   6 22.06 31.28 31.13
# DISK SUMMARY (/sec)
#KBRead RMerged  Reads SizeKB KBWrite WMerged Writes SizeKB
      0       0      0      0   83740     314    861     97
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost KBRead Reads KBWrite Writes
OST0004 0 0 40674 63
OST0005 0 0 40858 66
##########################################################################################
That's not too much for the machine, I'd reckon. And as mentioned in an
earlier post, I have run the very same 'stress' test, also with CPU load
or I/O load only, locally on machines that had crashed earlier. The test
runs that wrote to disk finished only when the disks were 100% full (the
disks were formatted as plain ext3 at the time); the tests with I/O load
= 500 and CPU load = 1k have been running for three days now. Of course
I don't know how reliable these tests are.
It looks to me as if a few Lustre threads for some reason can't process
their I/O any more, kind of building up pressure and finally blocking
all (disk) I/O.
Knowing the reason and how to avoid it would not only relieve these
servers of some pressure... ;-)
Hm, hardware: the cluster is running Debian Etch, kernel 2.6.22, Lustre
1.6.5. The OSSs are Supermicro X7DB8 file servers, Xeon E5320, 8 GB RAM,
with 16 internal disks on two 3ware 9650 RAID controllers, forming two
OSTs each.
Many thanks for any further hints,
Thomas