On Tue, 2009-10-27 at 17:02 +0100, Arne Brutschy wrote:
> Hello,
Hi,
> we currently have a cluster connected with a private GbE network.
> Traffic other than Lustre is minimal (Gridengine task management).
> Nevertheless, we have the MDS losing connection to the OSS from time to
> time under load.
What makes you think this is related to network saturation?
> As it proved to be critical to Lustre not to lose the connection between
> MDS and OSS, we're thinking about implementing a separate direct
> connection between the lustre servers, using one interface for internal
> traffic and another one for bulk traffic (client connection).
MDS->OSS traffic is minimal compared to client traffic. I'm not convinced
that what you have is really a network saturation problem.
You probably want to pursue the connection problems and work out why they
are happening before jumping to conclusions.
Bugzilla can be a great resource for trying to figure out what various
error messages might mean.
b.