Without the actual error messages and the version of the installed code, I
don't think anyone is going to be able to help much.
Personally, the first place I would start is the logs on the MDS. You can also
get the Lustre version from that node by running:
rpm -qa | grep lustre
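
As a rough sketch of the "start with the logs" step: Lustre kernel messages
are prefixed with "Lustre:" or "LustreError:", so they can be filtered out of
the syslog. The function names below are just illustrative, and the default
log path /var/log/messages is an assumption (typical for CentOS 5); adjust it
for your setup.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Lustre itself.

# Print Lustre-related kernel messages from a syslog file.
# The default path is an assumption; pass another file as $1 if needed.
lustre_log_lines() {
    logfile="${1:-/var/log/messages}"
    # Lustre kernel messages start with "Lustre:" or "LustreError:".
    grep -E 'Lustre(Error)?:' "$logfile"
}

# List installed Lustre packages (RPM-based systems only).
lustre_packages() {
    rpm -qa | grep -i lustre
}
```

On the MDS, something like "lustre_log_lines | tail -n 50" should show the
most recent messages, which are the ones worth posting to the list along with
the package versions.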
Recently, I also inherited a Lustre system (running 1.8.3) which was exhibiting
similar issues, and upgrading all the Lustre servers to 1.8.8-wc1 (and CentOS
5.8) seems to have resolved them.
Ron.
> -----Original Message-----
> From: lustre-discuss-bounces at lists.lustre.org
> [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Jason Brooks
> Sent: June 28, 2012 2:35 PM
> To: lustre-discuss at lists.lustre.org
> Subject: [Lustre-discuss] Lustre newbie
>
> Hello,
>
> I am totally new to lustre. I have inherited a couple of clusters which have
> a lustre filesystem mounted on each node via infiniband.
>
> One cluster has 56 nodes on it, the other has about 18.
>
> There are 6 lustre servers, five of which are ost's and the sixth is the ost.
>
> I have a problem: namely that the lustre filesystem is not mounting at times,
> or mysteriously unmounts itself. If I try to mount it, at times I will get an
> error, but I can't recall what it is. I think the latency values have
> something to do with it, but in truth, I am kind of at a loss where to start.
>
> I have lustre 1.8 installed. I have the pdf manual by sun, but what it really
> doesn't seem to illustrate well is how to step into a running system. Do any
> of you have any recommended reading?
>
> Thanks!
>
> --jason