Jeremy Mann
2007-Jan-04 12:39 UTC
[Lustre-discuss] Error messages when the system is under a high load
We are running Lustre 1.5.95 with the Lustre RPMs provided by Lustre for the RHEL 4 2.6 kernel. We have several big computational jobs running at the moment, and the OST node has a load of 13 or higher. When we try to copy data to the Lustre filesystem, the system hangs and gives us an I/O error. I am attaching the log file from /tmp.

/var/log/messages shows the following over and over again:

Lustre: 9579:0:(peer.c:238:lnet_debug_peer()) 0@lo 2 up 0 0 0 0 0 0
Lustre: 9579:0:(api-ni.c:1354:LNetCtl()) No ctl for 12345-0@lo
Lustre: 9579:0:(peer.c:238:lnet_debug_peer()) 0@lo 2 up 0 0 0 0 0 0
Lustre: 9579:0:(api-ni.c:1354:LNetCtl()) No ctl for 12345-0@lo
Lustre: 9579:0:(peer.c:238:lnet_debug_peer()) 0@lo 2 up 0 0 0 0 0 0

Thanks for any help.

--
Jeremy Mann
jeremy@biochem.uthscsa.edu
University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lustre-log.1167935415.9594.gz
Type: application/gzip
Size: 13511 bytes
Desc: not available
Url : http://mail.clusterfs.com/pipermail/lustre-discuss/attachments/20070104/89ca5e3d/lustre-log.1167935415.9594.bin