Do your disks have enough space?

>From: <nima@amy.udd.htu.se>
>To: lustre-discuss@lists.clusterfs.com
>Subject: [Lustre-discuss] bonnie++ and lustre 1.04
>Date: Fri, 14 May 2004 20:42:34 +0200 (CEST)
>
>I made a test with 1 client, 1 MDS and 2 OSTs with bonnie++.
>Lustre crashes every time I run the test. Here are the messages from one
>of the OST machines:
>May 14 19:45:16 ng4 kernel: LustreError: 1306:(ost_handler.c:643:ost_brw_write()) @@@ timeout on bulk GET req@e8328200 x1557/t0 o4-><?>@:-1 lens 288/248 ref 0 fl ?phase?:/0/0 rc 0/0
>May 14 19:45:26 ng4 kernel: LustreError: 1306:(niobuf.c:369:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc daaac000
>May 14 19:45:36 ng4 kernel: LustreError: 1306:(niobuf.c:369:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc daaac000
>May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal_cb.c:2520:ksocknal_find_timed_out_conn()) Timed out RX from 0xc10ac39a dbaeb800 192.168.53.1
>May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal_cb.c:2566:ksocknal_check_peer_timeouts()) Timeout out conn->0xc10ac39a ip 192.168.53.1:32778
>May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal.c:1009:ksocknal_destroy_conn()) Refusing to complete a partial receive from 0xc10ac39a, ip 192.168.53.1:32778
>May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal.c:1011:ksocknal_destroy_conn()) This may hang communications and prevent modules from unloading
>May 14 19:45:46 ng4 kernel: LustreError: 1306:(niobuf.c:369:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc daaac000
>May 14 19:46:26 ng4 last message repeated 4 times
>
>Any idea how to fix the problem?
>
>Regards
>Nima
On Fri, 2004-05-14 at 14:42, nima@amy.udd.htu.se wrote:
> I made a test with 1 client, 1 MDS and 2 OSTs with bonnie++.
> Lustre crashes every time I run the test. Here are the messages from one
> of the OST machines:

Are there any messages on the client node? This could be a few things, and it's hard to say from just the OST's messages.

-Phil
I made a test with 1 client, 1 MDS and 2 OSTs with bonnie++. Lustre crashes every time I run the test. Here are the messages from one of the OST machines:

May 14 19:45:16 ng4 kernel: LustreError: 1306:(ost_handler.c:643:ost_brw_write()) @@@ timeout on bulk GET req@e8328200 x1557/t0 o4-><?>@:-1 lens 288/248 ref 0 fl ?phase?:/0/0 rc 0/0
May 14 19:45:26 ng4 kernel: LustreError: 1306:(niobuf.c:369:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc daaac000
May 14 19:45:36 ng4 kernel: LustreError: 1306:(niobuf.c:369:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc daaac000
May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal_cb.c:2520:ksocknal_find_timed_out_conn()) Timed out RX from 0xc10ac39a dbaeb800 192.168.53.1
May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal_cb.c:2566:ksocknal_check_peer_timeouts()) Timeout out conn->0xc10ac39a ip 192.168.53.1:32778
May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal.c:1009:ksocknal_destroy_conn()) Refusing to complete a partial receive from 0xc10ac39a, ip 192.168.53.1:32778
May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal.c:1011:ksocknal_destroy_conn()) This may hang communications and prevent modules from unloading
May 14 19:45:46 ng4 kernel: LustreError: 1306:(niobuf.c:369:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc daaac000
May 14 19:46:26 ng4 last message repeated 4 times

Any idea how to fix the problem?

Regards
Nima
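
When comparing the OST messages with whatever the client node logged, it can help to group the LustreError entries by source file and function first. The following is only a minimal Python sketch, assuming the syslog line format seen in this thread. The `summarize` helper and the `SAMPLE` list are illustrative names, not part of Lustre, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Assumed line shape, taken from the messages in this thread:
#   LustreError: <pid>:(<file>:<line>:<function>()) <message>
LUSTRE_ERROR = re.compile(
    r"LustreError:\s*(?P<pid>\d+):"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)"
)


def summarize(lines):
    """Count LustreError entries per (source file, function)."""
    counts = Counter()
    for line in lines:
        match = LUSTRE_ERROR.search(line)
        if match:  # lines such as "last message repeated" are skipped
            counts[(match.group("file"), match.group("func"))] += 1
    return counts


# A few lines copied from the OST log in this thread.
SAMPLE = [
    "May 14 19:45:26 ng4 kernel: LustreError: 1306:(niobuf.c:369:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc daaac000",
    "May 14 19:45:36 ng4 kernel: LustreError: 1306:(niobuf.c:369:ptlrpc_abort_bulk()) Unexpectedly long timeout: desc daaac000",
    "May 14 19:45:44 ng4 kernel: LustreError: 1249:(socknal.c:1011:ksocknal_destroy_conn()) This may hang communications and prevent modules from unloading",
    "May 14 19:46:26 ng4 last message repeated 4 times",
]

if __name__ == "__main__":
    for (src, func), n in summarize(SAMPLE).most_common():
        print(f"{n:3d}  {src}:{func}")
```

The same function could be pointed at the client's syslog (for example by passing an open file object) so that both sides can be read next to each other. Note that syslog's "last message repeated N times" lines are not counted by this sketch.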