Displaying 2 results from an estimated 2 matches for "ost_connect".

2008 Feb 12
0
Lustre-discuss Digest, Vol 25, Issue 17
...rc 0/-22
Lustre: data-OST0000-osc-ffff810139ce4800: Connection to service data-OST0000 via nid 192.168.64.71@o2ib was lost; in progress operations using this service will wait for recovery to complete.
LustreError: 11-0: an error occurred while communicating with 192.168.64.71@o2ib. The ost_connect operation failed with -16
LustreError: 11-0: an error occurred while communicating with 192.168.64.71@o2ib. The ost_connect operation failed with -16
I've increased the timeout to 300 seconds and it has helped marginally. -Aaron On Feb 9, 2008, at 12:06 AM, Tom.Wang wrote: > Hi,...
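For readers scanning logs like the excerpt above, the following is a minimal sketch of how the numeric codes could be decoded. It assumes the reported value is a negated Linux errno number (under that assumption -16 corresponds to EBUSY and -22 to EINVAL). The function name `decode_failures` and the regular expression are illustrative only and are not part of Lustre or its tools.

```python
import errno
import os
import re

# Matches the "The <op> operation failed with <code>" phrasing from the excerpt.
FAILED_RE = re.compile(r"The (\w+) operation failed with (-?\d+)")


def decode_failures(log_text):
    """Return (operation, code, errno name) for each failure found in log_text.

    Assumption: the code is a negated errno value, so its absolute value
    is looked up in the platform's errno table.
    """
    results = []
    for op, code in FAILED_RE.findall(log_text):
        number = abs(int(code))
        name = errno.errorcode.get(number, "UNKNOWN")
        results.append((op, int(code), name))
    return results


sample = (
    "LustreError: 11-0: an error occurred while communicating with "
    "192.168.64.71@o2ib. The ost_connect operation failed with -16"
)

for op, code, name in decode_failures(sample):
    # os.strerror gives the platform's human-readable text for the errno.
    print(op, code, name, os.strerror(abs(code)))
```

Repeated identical lines, as in the excerpt, would simply yield repeated tuples, which can be counted to see how often a given operation is failing.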
2008 Feb 04
32
Lustre clients getting evicted
on our cluster that has been running Lustre for about 1 month. I have 1 MDT/MGS and 1 OSS with 2 OSTs. Our cluster uses all GigE and has about 608 nodes and 1854 cores. We have a lot of jobs that die and/or go into high IO wait; strace shows processes stuck in fstat(). The big problem (I think, and I would like some feedback on it) is that of these 608 nodes, 209 of them have in dmesg