search for: el5lustr

Displaying 3 results from an estimated 3 matches for "el5lustr".

2007 Nov 29
2
Balancing I/O Load
...ncing the work load among OSSs/OSTs. Shouldn't lustre be doing a better job (by default) of distributing the workload? Charlie Taylor UF HPC Center FWIW, the servers are dual-processor, dual-core Opterons (275s) with 4GB RAM each. They are running CentOS 5 w/ a 2.6.18-8.1.14.el5Lustre (patched lustre, smp kernel) and the deadline I/O scheduler. If it matters, our OSTs are atop LVM2 volumes (for management). The back-end storage is all Fibre-channel RAID (Xyratex). We have tuned the servers and know that we can get roughly 500MB/s per server across a striped *loc...
2008 Mar 04
16
Cannot send after transport endpoint shutdown (-108)
This morning I've had both my infiniband and tcp lustre clients hiccup. They are evicted from the server, presumably as a result of their high load and consequent timeouts. My question is: why don't the clients reconnect? The infiniband and tcp clients both give the following message when I type "df": Cannot send after transport endpoint shutdown (-108). I've
2008 Feb 04
32
Luster clients getting evicted
on our cluster that has been running lustre for about 1 month. I have 1 MDT/MGS and 1 OSS with 2 OSTs. Our cluster uses all GigE and has about 608 nodes and 1854 cores. We have a lot of jobs that die and/or go into high IO wait; strace shows processes stuck in fstat(). The big problem (I think, and I would like some feedback on it) is that of these 608 nodes, 209 of them have in dmesg