cedric.lambert@bull.net
2007-Apr-13 09:17 UTC
[Lustre-devel] [Bug 12228] New: LBUG after : ptlrpc_check_set() bad phase ebc0de00
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=12228

One of our customers, running 1.4.6.1 + Bull patches for Lustre + Quadrics, has hit the following problem: one of its Lustre clients hit an LBUG after a "ptlrpc_check_set() bad phase ebc0de00" message. Here is a targeted extract of the dump report:

00000100:00020000:5:1175080659.947287:864:19441:0:(client.c:577:ptlrpc_check_status()) @@@ type == PTL_RPC_MSG_ERR, err == -122 req@e0000023fcb2e380 x33516079/t0 o4->ost_galibier19.da0.19_UUID@galibier19_UUID:28 lens 328/288 ref 2 fl Rpc:R/0/0 rc 0/-122
00000100:00020000:1:1175236082.477747:864:19441:0:(client.c:577:ptlrpc_check_status()) @@@ type == PTL_RPC_MSG_ERR, err == -122 req@e0000063ff642d80 x38661496/t0 o4->ost_galibier19.da0.27_UUID@galibier19_UUID:28 lens 328/288 ref 2 fl Rpc:R/0/0 rc 0/-122
00000100:00020000:1:1175507735.909770:656:19441:0:(client.c:751:ptlrpc_check_set()) @@@ bad phase ebc0de00 req@e0000063ff64cf80 x44935727/t0 o4->ost_galibier19.da0.27_UUID@galibier19_UUID:28 lens 328/288 ref 3 fl New:I/0/0 rc 0/0
00000100:00040000:1:1175507735.928262:656:19441:0:(client.c:752:ptlrpc_check_set()) LBUG
[...]
<1>LustreError: dumping log to /tmp/lustre-log.1175507735.19441
[...]
========================== STACK TRACE OF FAILING TASK ==========================
STACK TRACE FOR TASK: 0xe00000037c710000 (ptlrpcd)
 0 __dump_save_context+0x16c [0xa0000001004352cc]
 1 dump_lcrash_save_context+0x2c [0xa00000010043124c]
 2 dump_lcrash_configure_header+0xaec [0xa0000001004311ac]
 3 dump_generic_execute+0x43c [0xa000000100434f3c]
 4 dump_execute+0x12c [0xa00000010042db6c]
 5 panic_event+0x76c [0xa00000010042fe8c]
 6 notifier_call_chain+0x6c [0xa0000001000a302c]
 7 panic+0x16c [0xa00000010007c4ec]
 8 ptlrpc_check_set+0x4f6c [0xa000000209bd180c]
 9 ptlrpcd_check+0x68c [0xa000000209c78f8c]
10 ptlrpcd+0x6ac [0xa000000209c7988c]
11 kernel_thread_helper+0x2c [0xa0000001000127ec]
12 start_kernel_thread+0x1c [0xa000000100008d1c]

This problem is similar to Bugzilla #11083. However, there were two LBUGs in #11083, which is not the case here. It seems that before this LBUG, RPCs could not be sent because of "quota exceeded" (the err == -122 replies in the extract above): could a quota problem be at the origin of this LBUG?
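
To help judge whether the quota errors and the LBUG are related, the entries in the log extract above can be put on a timeline. The following is only a minimal Python sketch: the field names used in the regular expression (subsys, mask, cpu, stack, pid, extpid) are my assumptions about the colon-separated debug-log prefix and are not confirmed by this report, the messages in SAMPLE are shortened copies of the lines above, and the errno lookup relies on the Linux numbering, where 122 is EDQUOT ("Disk quota exceeded").

```python
import errno
import re
from datetime import datetime, timezone

# Assumed layout of a debug-log entry; the group names are hypothetical labels:
# subsys:mask:cpu:timestamp:stack:pid:extpid:(file:line:function()) message
LINE_RE = re.compile(
    r"^(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):(?P<cpu>\d+):"
    r"(?P<ts>\d+\.\d+):(?P<stack>\d+):(?P<pid>\d+):(?P<extpid>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)$"
)
ERR_RE = re.compile(r"err == (-?\d+)")

# Shortened copies of entries from the extract above (message tails dropped).
SAMPLE = [
    "00000100:00020000:5:1175080659.947287:864:19441:0:"
    "(client.c:577:ptlrpc_check_status()) @@@ type == PTL_RPC_MSG_ERR, err == -122",
    "00000100:00020000:1:1175507735.909770:656:19441:0:"
    "(client.c:751:ptlrpc_check_set()) @@@ bad phase ebc0de00",
]


def parse_line(text):
    """Split one log entry into fields; return None if it does not match."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # The fourth field looks like seconds since the Unix epoch.
    rec["when"] = datetime.fromtimestamp(float(rec["ts"]), tz=timezone.utc)
    err = ERR_RE.search(rec["msg"])
    rec["err"] = int(err.group(1)) if err else None
    return rec


def describe(rec):
    """Render one parsed entry as 'UTC time  function  errno name (if any)'."""
    when = rec["when"].strftime("%Y-%m-%d %H:%M:%S")
    if rec["err"] is None:
        return f"{when}  {rec['func']}"
    # errno.errorcode is platform-dependent; the name below is the local one.
    name = errno.errorcode.get(-rec["err"], "unknown")
    return f"{when}  {rec['func']}  err={rec['err']} ({name})"


if __name__ == "__main__":
    for entry in SAMPLE:
        parsed = parse_line(entry)
        if parsed is not None:
            print(describe(parsed))
```

Read this way, the two err == -122 replies date from 2007-03-28 and 2007-03-30 (UTC), while the "bad phase" message and the LBUG were logged on 2007-04-02, so the extract by itself does not show a quota error immediately before the LBUG.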