Hi All,

We have a Lustre installation in which two OSS nodes are configured in HA mode. We found that one of them was stonithed. /var/log/messages showed the following errors before it was stonithed:

================================================================================
Feb 1 10:35:57 oss5 heartbeat: [8336]: WARN: Gmain_timeout_dispatch: Dispatch function for memory stats took too long to execute: 870 ms (> 100 ms) (GSource: 0x1e6c62a8)
Feb 1 10:36:00 oss5 kernel: LustreError: 27684:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:03 oss5 kernel: LustreError: 15913:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:08 oss5 kernel: LustreError: 12380:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:09 oss5 kernel: LustreError: 12261:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:10 oss5 kernel: LustreError: 9713:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:12 oss5 kernel: LustreError: 4114:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:13 oss5 kernel: LustreError: 4092:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:15 oss5 kernel: LustreError: 12398:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:17 oss5 kernel: LustreError: 12283:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:18 oss5 kernel: LustreError: 12325:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:19 oss5 kernel: LustreError: 9752:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:19 oss5 kernel: LustreError: 23057:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:22 oss5 kernel: LustreError: 12428:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:22 oss5 kernel: LustreError: 9679:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:27 oss5 kernel: LustreError: 9686:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:28 oss5 kernel: LustreError: 12385:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:33 oss5 kernel: LustreError: 27687:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:35 oss5 kernel: LustreError: 12264:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:40 oss5 kernel: LustreError: 9784:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:43 oss5 kernel: LustreError: 23117:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:48 oss5 kernel: LustreError: 12265:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:52 oss5 kernel: LustreError: 4103:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:36:57 oss5 kernel: LustreError: 12415:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:02 oss5 kernel: LustreError: 23132:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:05 oss5 kernel: LustreError: 23100:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:07 oss5 kernel: LustreError: 9714:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:07 oss5 kernel: LustreError: 12429:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:07 oss5 kernel: LustreError: 4090:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:10 oss5 kernel: LustreError: 9773:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:14 oss5 kernel: LustreError: 9781:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:19 oss5 kernel: LustreError: 9752:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:25 oss5 kernel: LustreError: 23082:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:32 oss5 kernel: LustreError: 15927:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:41 oss5 kernel: LustreError: 9761:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:50 oss5 kernel: LustreError: 12382:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:53 oss5 kernel: LustreError: 15925:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:55 oss5 kernel: LustreError: 23102:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:37:58 oss5 kernel: LustreError: 9732:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:00 oss5 kernel: LustreError: 12442:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:02 oss5 kernel: LustreError: 9658:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:07 oss5 kernel: LustreError: 12342:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:13 oss5 kernel: LustreError: 4108:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:20 oss5 kernel: LustreError: 12271:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:21 oss5 kernel: LustreError: 9683:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:28 oss5 kernel: LustreError: 23118:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:37 oss5 kernel: LustreError: 4124:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:47 oss5 kernel: LustreError: 9670:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:47 oss5 kernel: LustreError: 15920:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:51 oss5 kernel: LustreError: 9768:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:38:57 oss5 kernel: LustreError: 23058:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:00 oss5 kernel: LustreError: 9708:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:04 oss5 kernel: LustreError: 4115:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:09 oss5 kernel: LustreError: 9771:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:15 oss5 kernel: LustreError: 15926:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:22 oss5 kernel: LustreError: 12411:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:30 oss5 kernel: LustreError: 12438:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:33 oss5 kernel: LustreError: 12266:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:35 oss5 kernel: LustreError: 27698:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:38 oss5 kernel: LustreError: 9776:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:39 oss5 kernel: LustreError: 27691:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:42 oss5 kernel: LustreError: 15931:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:47 oss5 kernel: LustreError: 9780:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:49 oss5 kernel: LustreError: 9753:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:49 oss5 kernel: LustreError: 12378:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:39:53 oss5 kernel: LustreError: 9766:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
Feb 1 10:40:00 oss5 kernel: LustreError: 9712:0:(filter_io_26.c:669:filter_commitrw_write()) error starting transaction: rc = -30
================================================================================

Later on, the other (HA) OSS was also stonithed after logging messages like the ones above. What could be the problem, and what else should we look for to diagnose it?

Thanks and Regards,
Prithu

On Sat, Feb 04, 2012 at 12:41:08AM +0530, Prithu Tiwari wrote:
> 27684:0:(filter_io_26.c:669:filter_commitrw_write()) error starting
> transaction: rc = -30

#define EROFS 30 /* Read-only file system */

This means that the backend filesystem has been remounted read-only.
There is likely an error message from ldiskfs earlier in the log.

Cheers,
Johann
-- 
Johann Lombardi
Whamcloud, Inc.
www.whamcloud.com
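
As a rough illustration of the advice above, the following Python sketch decodes the "rc = -N" value in a log line into its errno name and collects any lines mentioning ldiskfs that appear before the first EROFS error. The helper names (decode_rc, backend_messages_before_first_erofs) and the simple substring match on "ldiskfs" are assumptions made for this example; they are not part of Lustre or of any tool mentioned in the thread, and the errno numbering is assumed to be the Linux one.

```python
import errno
import os
import re

# Matches the "rc = -30" style return code that Lustre prints (hypothetical helper pattern).
RC_PATTERN = re.compile(r"rc = (-?\d+)")


def decode_rc(line):
    """Return (errno name, description) for the 'rc = -N' value in a line, or None."""
    match = RC_PATTERN.search(line)
    if match is None:
        return None
    code = abs(int(match.group(1)))  # kernel code reports errors as negative errno values
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def backend_messages_before_first_erofs(lines, marker="ldiskfs"):
    """Collect lines mentioning the backend filesystem that precede the first EROFS error."""
    found = []
    for line in lines:
        decoded = decode_rc(line)
        if decoded is not None and decoded[0] == "EROFS":
            break  # everything after this point is a consequence, not the cause
        if marker.lower() in line.lower():
            found.append(line)
    return found


# One of the lines from the original report.
sample = ("Feb 1 10:36:00 oss5 kernel: LustreError: 27684:0:"
          "(filter_io_26.c:669:filter_commitrw_write()) "
          "error starting transaction: rc = -30")
print(decode_rc(sample))
```

To apply it to a real log, read the file into a list of lines (for example with open(path).read().splitlines()) and pass that list to backend_messages_before_first_erofs; an empty result would suggest that the original ldiskfs message is in an older, rotated log file or was never written.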