Wojciech Turek
2007-Aug-21 08:21 UTC
[Lustre-discuss] LustreError: lock callback timer expired
Hi,

We are running Lustre 1.6.1
kernel on lustre clients: 2.6.9-55.0.2.EL.cernsmp
kernel on lustre servers: 2.6.9-55.EL_lustre-1.6.1smp

Today we've noticed a LustreError in the sys log.

on MDS/MGS server:

Aug 21 12:37:31 storage03 kernel: LustreError: 0:0:(ldlm_lockd.c:214:waiting_locks_callback()) ### lock callback timer expired: evicting client 11a230c6-066c-9ea4-dde6-c59340e1ae39@NET_0x200010a8e0a68_UUID nid 10.142.10.104@tcp1 ns: mds-home-md-MDT0000_UUID lock: 0000010031c61940/0x6d867fda2ace7151 lrc: 1/0,0 mode: CR/CR res: 34905917/1777246764 bits 0x5 rrc: 2 type: IBT flags: 20 remote: 0x2a90aaa09ccaf8cf expref: 305 pid 18988
Aug 21 12:37:32 storage03 kernel: Lustre: 18911:0:(mds_reint.c:125:mds_finish_transno()) commit transaction for disconnected client 11a230c6-066c-9ea4-dde6-c59340e1ae39: rc 0

on Lustre client:

Aug 21 12:37:32 bindloe04 kernel: Lustre: home-md-MDT0000-mdc-00000102261ec400: Connection to service home-md-MDT0000 via nid 10.142.10.3@tcp1 was lost; in progress operations using this service will wait for recovery to complete.
Aug 21 12:37:32 bindloe04 kernel: LustreError: 167-0: This client was evicted by home-md-MDT0000; in progress operations using this service will fail.
Aug 21 12:37:32 bindloe04 kernel: LustreError: 31873:0:(client.c:520:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@00000102263b8c00 x722284/t0 o35->home-md-MDT0000_UUID@10.143.245.3@tcp:12 lens 296/448 ref 1 fl Rpc:/0/0 rc 0/0
Aug 21 12:37:32 bindloe04 kernel: LustreError: 31873:0:(client.c:520:ptlrpc_import_delay_req()) Skipped 10 previous similar messages
Aug 21 12:37:32 bindloe04 kernel: LustreError: 31873:0:(file.c:97:ll_close_inode_openhandle()) inode 34905662 mdc close failed: rc = -108
Aug 21 12:37:32 bindloe04 kernel: LustreError: 31873:0:(file.c:97:ll_close_inode_openhandle()) Skipped 10 previous similar messages
Aug 21 12:37:32 bindloe04 kernel: Lustre: home-md-MDT0000-mdc-00000102261ec400: Connection restored to service home-md-MDT0000 using nid 10.142.10.3@tcp1.

Any idea what could cause this error?

Regards
Wojciech Turek

Mr Wojciech Turek
Assistant System Manager
University of Cambridge
High Performance Computing service
Kilian CAVALOTTI
2007-Aug-21 18:09 UTC
[Lustre-discuss] LustreError: lock callback timer expired
Hi Wojciech,

On Tuesday 21 August 2007 07:21:34 am Wojciech Turek wrote:
> We are running lustre 1.6.1
> kernel on lustre clients: 2.6.9-55.0.2.EL.cernsmp
> kernel on lustre servers: 2.6.9-55.EL_lustre-1.6.1smp
> Today We've noticed LustreError in sys log
> on MDS/MGS server:
> Aug 21 12:37:31 storage03 kernel: LustreError: 0:0:(ldlm_lockd.c:
> 214:waiting_locks_callback()) ### lock callback timer expired:

> on Lustre client:
> Aug 21 12:37:32 bindloe04 kernel: LustreError: 31873:0:(client.c:
> 520:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@00000102263b8c00
> x722284/t0 o35->home-md-MDT0000_UUID@10.143.245.3@tcp:12 lens 296/448
> ref 1 fl Rpc:/0/0 rc 0/0

> Any idea what could cause this error?

It looks quite similar to this, despite the bug title:
https://bugzilla.lustre.org/show_bug.cgi?id=13345#c11

From your error messages, I assume you're using Ethernet. The similar error I got occurred on an Infiniband fabric. So even though this looks like a network interruption error, it seems to be independent of the underlying network type.

That's not very helpful, but at least there's already a bug report.

Cheers,
--
Kilian
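
For anyone who wants to match a server-side eviction against the client logs, here is a minimal Python sketch that pulls the evicted client's UUID and NID out of a "lock callback timer expired" line. It assumes the message layout is exactly as in the MDS log quoted in this thread; other Lustre versions may format it differently. The names `EVICTION_RE` and `parse_eviction` are hypothetical helpers for illustration and are not part of Lustre.

```python
import re

# Hypothetical helper, not part of Lustre. The pattern assumes the
# "evicting client <uuid>@<net uuid> nid <nid>" layout seen in the
# MDS log quoted in this thread.
EVICTION_RE = re.compile(
    r"lock callback timer expired: evicting client\s+"
    r"(?P<uuid>[0-9a-f-]+)@\S+\s+nid\s+(?P<nid>\S+)"
)


def parse_eviction(line):
    """Return (client_uuid, client_nid), or None if the line does not match."""
    match = EVICTION_RE.search(line)
    if match is None:
        return None
    return match.group("uuid"), match.group("nid")


# Sample taken from the MDS log above, with the wrapped pieces rejoined.
sample = (
    "Aug 21 12:37:31 storage03 kernel: LustreError: "
    "0:0:(ldlm_lockd.c:214:waiting_locks_callback()) ### lock callback timer "
    "expired: evicting client 11a230c6-066c-9ea4-dde6-c59340e1ae39"
    "@NET_0x200010a8e0a68_UUID nid 10.142.10.104@tcp1 ns: "
    "mds-home-md-MDT0000_UUID"
)

print(parse_eviction(sample))
```

The NID it returns (10.142.10.104@tcp1 in this case) identifies which client node to check for the matching "This client was evicted" message.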