Martin Vogt
2006-May-19 07:36 UTC
[Lustre-discuss] question about ptlrpc_expire_one_request
Hello clusterfs,

I'm using Lustre 1.3.2, built with gcc 3.3.3 on SuSE 9.0 against a vanilla 2.4.24 kernel. When I run IOZone in an endless loop, I sometimes get these error messages:

>LustreError: 4370:0:(client.c:816:ptlrpc_expire_one_request()) @@@ timeout (sent at 1097225851) req@ceb2dc00
>x30114/t22042 o4->media-ost1_UUID@NID_192.168.9.11_UUID:6 lens 288/248 ref 3 fl ?phase?:R/4/0 rc 0/0
>LustreError: 4370:0:(import.c:106:ptlrpc_set_import_discon()) OSC_media5_media-ost1_MNT_media5: connection
>lost to media-ost1_UUID@NID_192.168.9.11_UUID
>LustreError: 4349:0:(recover.c:113:ptlrpc_run_failed_import_upcall()) Invoked upcall /usr/local/lustre/recover.sh
>FAILED_IMPORT media-ost1_UUID OSC_media5_media-ost1_MNT_media5 NID_192.168.9.11_UUID

My recover upcall script then reintegrates the missing server, but my question is: why does the client sometimes lose the connection in the first place? Could it have something to do with the high network load?

Regards,
Martin