Displaying 2 results from an estimated 2 matches for "ost_connect".
2008 Feb 12
Lustre-discuss Digest, Vol 25, Issue 17
...rc 0/-22
Lustre: data-OST0000-osc-ffff810139ce4800: Connection to service
data-OST0000 via nid 192.168.64.71@o2ib was lost; in progress operations
using this service will wait for recovery to complete.
LustreError: 11-0: an error occurred while communicating with
192.168.64.71@o2ib. The ost_connect operation failed with -16
LustreError: 11-0: an error occurred while communicating with
192.168.64.71@o2ib. The ost_connect operation failed with -16
I've increased the timeout to 300 seconds and it has helped marginally.
-Aaron
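
The negative numbers in the log lines above (-16, and -22 in the "rc 0/-22" fragment) look like negated Linux errno values. As a minimal sketch under that assumption, the following Python snippet extracts the failing operation and decodes the code. The regular expression and the helper name decode_failures are illustrative only and are not part of any Lustre tool.

```python
import errno
import os
import re

# Matches the "operation failed with -N" wording seen in the log excerpt.
# The pattern is an assumption based on that excerpt, not a Lustre log spec.
FAILED_RE = re.compile(r"The (\w+) operation failed with (-?\d+)")


def decode_failures(log_text):
    """Return (operation, code, errno name, description) for each match."""
    results = []
    for op, code in FAILED_RE.findall(log_text):
        num = abs(int(code))  # the log reports the errno negated
        name = errno.errorcode.get(num, "UNKNOWN")
        results.append((op, int(code), name, os.strerror(num)))
    return results


# Sample line written in the style of the excerpt above.
sample = (
    "LustreError: 11-0: an error occurred while communicating with "
    "192.168.64.71@o2ib. The ost_connect operation failed with -16"
)

for op, code, name, text in decode_failures(sample):
    print(f"{op}: {code} -> {name} ({text})")
```

On Linux, 16 maps to EBUSY and 22 to EINVAL, so under this reading the ost_connect failures mean the target reported itself busy rather than unreachable.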
On Feb 9, 2008, at 12:06 AM, Tom.Wang wrote:
> Hi,...
2008 Feb 04
Lustre clients getting evicted
on our cluster that has been running Lustre for about 1 month. I have
1 MDT/MGS and 1 OSS with 2 OSTs.
Our cluster uses GigE throughout and has about 608 nodes (1854 cores).
We have a lot of jobs that die and/or go into high I/O wait; strace
shows processes stuck in fstat().
The big problem, I think (and I would like some feedback on it), is that
of these 608 nodes, 209 have in dmesg