eeb@clusterfs.com
2007-Apr-23 06:54 UTC
[Lustre-devel] [Bug 12324] New: Spurious errors reported on incoming requests to a server when it times out a client
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=12324

When LNET times out communications with a peer, any partially received incoming RPC requests complete with failure, and the following message appears...

  LustreError: 6403:0:(events.c:172:request_in_callback()) event type 1, status -5, service ost

...however, if one such request happens to exhaust a posted request buffer, the following additional errors appear...

  LustreError: 6635:0:(pack_generic.c:320:lustre_unpack_msg()) message length 0 too small for magic/version check
  LustreError: 6635:0:(service.c:487:ptlrpc_server_handle_request()) error unpacking request: ptl 28 from 12345-4@ptl xid 941

...these messages should be ignored. They are only reported because request_in_callback must queue something to ptlrpc_main to get the buffer recycled in service thread context, and that something happens to be the request itself, which is truncated to 0 bytes because it completed in error.

The bug fix should...

a) not add such requests to the service's history list
b) mark the request explicitly as received in error so that it is only used to recycle the request buffer.
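The two points of the proposed fix could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual ptlrpc code: every name here (sketch_request, sketch_service, sketch_request_in, sketch_handle_request, SKETCH_MIN_MSG_LEN) is hypothetical, and the real structures and call sites differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only;
 * these are NOT the real ptlrpc structures or functions. */
struct sketch_request {
    size_t len;            /* bytes actually received */
    bool   recv_error;     /* proposed flag: receive completed in error */
    bool   last_in_buffer; /* this request exhausted its request buffer */
};

struct sketch_service {
    int history_len;       /* requests on the history list */
    int queued;            /* requests handed to the service thread */
    int buffers_recycled;  /* request buffers recycled */
    int unpack_errors;     /* "error unpacking request" reports */
};

/* Assumed minimum length needed for a magic/version check. */
#define SKETCH_MIN_MSG_LEN 8

/* Event-callback side: returns true if the request was queued. */
static bool sketch_request_in(struct sketch_service *svc,
                              struct sketch_request *req,
                              int ev_status, size_t ev_len,
                              bool buffer_exhausted)
{
    assert(svc != NULL && req != NULL);

    req->recv_error = (ev_status != 0);
    req->len = req->recv_error ? 0 : ev_len;
    req->last_in_buffer = buffer_exhausted;

    /* A failed receive that did not exhaust the buffer needs no follow-up. */
    if (req->recv_error && !buffer_exhausted)
        return false;

    /* (a) only successfully received requests go on the history list */
    if (!req->recv_error)
        svc->history_len++;

    svc->queued++;
    return true;
}

/* Service-thread side: handle one queued request. */
static void sketch_handle_request(struct sketch_service *svc,
                                  const struct sketch_request *req)
{
    assert(svc != NULL && req != NULL);

    /* (b) a request marked as received in error is never unpacked,
     * so no spurious unpack error is counted for it */
    if (!req->recv_error) {
        if (req->len < SKETCH_MIN_MSG_LEN)
            svc->unpack_errors++; /* genuinely short message */
        /* normal request processing would go here */
    }

    /* Either way, the exhausted buffer gets recycled in this context. */
    if (req->last_in_buffer)
        svc->buffers_recycled++;
}
```

In this sketch, a request that fails with status -5 and exhausts its buffer is still queued, but it only triggers the buffer recycle: it never reaches the history list or the unpack step, so the two additional errors would not be reported.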