Sébastien Buisson
2008-Aug-05 14:56 UTC
[Lustre-discuss] Unexplained messages from Lustre in the syslog
Hi there,

We run a cluster with Lustre 1.4.12, and we see various Lustre messages in the syslog for which we could not find any explanation (and which are not self-explanatory to us :) ).

Here they are:

- "cancel n llog-records failed"
Jun 15 06:09:53 s_kernel@galibier1 kernel: LustreError: 23849:0:llog_server.c:432:llog_origin_handle_cancel()) cancel 125 llog-records failed: -22

- "no handle for file close"
Jun 21 10:51:22 s_kernel@galibier1 kernel: LustreError: 23682:0:(mds_open.c:1492:mds_close()) @@@ no handle for file close ino 17111163: cookie 0x9590c61c1bfee96e req@e0000002c973a100 x73343089/t0 o35->63db32f1-425e-f044-86e7-ae7ef91215a7@NET_0x1000000000003_UUID:-1 lens 240/2072 ref 0 fl Interpret:/0/0 rc 0/0

We did not notice any Lustre malfunction when the messages appeared, but our customer is worried about them...

Does anybody know the meaning of these messages?

Thanks in advance.
Sebastien.
Andreas Dilger
2008-Aug-18 07:15 UTC
[Lustre-discuss] Unexplained messages from Lustre in the syslog
On Aug 05, 2008 16:56 +0200, Sébastien Buisson wrote:
> We run a cluster with Lustre 1.4.12, and we can see various Lustre
> messages in the syslog for which we did not find any explanation (and
> that are not self explanatory for us :) ).
>
> Here they are:
> - "cancel n llog-records failed"
> Jun 15 06:09:53 s_kernel@galibier1 kernel: LustreError:
> 23849:0:llog_server.c:432:llog_origin_handle_cancel()) cancel 125
> llog-records failed: -22

This indicates that the OST had some partially completed transactions (most likely unlinks) that it tried to report to the MDS as committed, but there was no valid MDS connection. The transaction will be retried from the MDS at a later time.

> - "no handle for file close"
> Jun 21 10:51:22 s_kernel@galibier1 kernel: LustreError:
> 23682:0:(mds_open.c:1492:mds_close()) @@@ no handle for file close ino
> 17111163: cookie 0x9590c61c1bfee96e req@e0000002c973a100 x73343089/t0
> o35->63db32f1-425e-f044-86e7-ae7ef91215a7@NET_0x1000000000003_UUID:-1
> lens 240/2072 ref 0 fl Interpret:/0/0 rc 0/0

This is a client trying to close a file after it was evicted.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
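
For anyone triaging similar lines: the trailing "-22" in the first message appears to be a negative errno value, and 22 is EINVAL on Linux. The following is only a rough Python sketch for pulling the source file, line, function and return code out of a LustreError line. The regular expression and the `describe` helper are hypothetical, written against the two messages quoted in this thread rather than any documented Lustre log format, so treat the pattern as an assumption.

```python
import errno
import os
import re

# Hypothetical pattern, derived only from the two messages quoted above.
# The opening parenthesis before the file name is optional because the
# first quoted message lacks it.
LUSTRE_ERROR = re.compile(
    r"LustreError:\s*(?P<pid>\d+):\d+:\(?"
    r"(?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)"
)


def describe(log_line):
    """Return a dict of fields from a LustreError line, or None if no match."""
    match = LUSTRE_ERROR.search(log_line)
    if match is None:
        return None
    info = match.groupdict()
    # Assumption: a trailing ": -N" is a negated errno value.
    rc = re.search(r":\s*(-\d+)\s*$", info["msg"])
    if rc:
        code = -int(rc.group(1))
        info["errno_name"] = errno.errorcode.get(code, "unknown")
        info["errno_text"] = os.strerror(code)
    return info


if __name__ == "__main__":
    sample = (
        "Jun 15 06:09:53 s_kernel at galibier1 kernel: LustreError: "
        "23849:0:llog_server.c:432:llog_origin_handle_cancel()) "
        "cancel 125 llog-records failed: -22"
    )
    print(describe(sample))
```

Messages that do not end in ": -N", such as the "no handle for file close" one, still yield the file, line and function fields but no errno entry.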